(54) Polyester synthase gene and process for producing polyester

(57) The present invention relates to a polyester
synthase gene coding for a polypeptide containing the
amino acid sequence of SEQ ID NO:2, or a sequence in
which one or more amino acids of said amino acid
sequence are deleted, replaced or added, said polypeptide
having polyester synthase activity; a gene
expression cassette comprising the polyester synthase
gene and either of the open reading frames located
upstream and downstream of said gene; a recombinant
vector comprising the gene expression cassette; a
transformant transformed with the recombinant vector;
and a process for producing polyester by culturing the
transformant in a medium and recovering polyester from
the resulting culture.
Description

Field of the Invention 

The present invention relates to a polyester synthase gene, a recombinant vector containing the gene, a
transformant carrying the recombinant vector, and a process for producing polyester by use of the transformant.

Background of the Invention 

It is known that a large number of microorganisms biosynthesize poly-3-hydroxybutyrate (P(3HB)) and store it in
the form of ultrafine particles as an energy source in the body. P(3HB) extracted from microorganisms is a thermoplastic
polymer with a melting temperature of about 180 °C, and because of its excellent biodegradability and biocompatibility
it is drawing attention as "green" plastic for preservation of the environment. Further, P(3HB) is "green" plastic which
can be synthesized from renewable carbon resources including sugars and vegetable oils by various microorganisms.

However, P(3HB) is a highly crystalline polymer and thus has poor impact resistance,
so its practical application has never been attempted.

Recently, polyester P(3HB-co-3HH), a random copolymer of 3-hydroxybutyrate (3HB) and 3-hydroxyhexanoate
(3HH), and a process for producing the same have been studied and developed, and these are described in e.g.
Japanese Patent Laid Open Publication Nos. 93049/1993 and 265065/1995 respectively. In these publications, the
P(3HB-co-3HH) copolymer is produced from alkanoic acids or olive oil by fermentation with Aeromonas caviae isolated
from soil. It has been shown that the degree of crystallinity of the P(3HB-co-3HH) copolymer produced through
fermentation decreases with an increasing ratio of the 3HH unit in it, so that the copolymer becomes a soft polymeric
material excellent in thermostability and formability and can be manufactured into strong yarn or transparent flexible
film (Y. Doi, S. Kitamura, H. Abe, Macromolecules 28, 4822-4823 (1995)). However, the yield of polyester (content of
polyester in dried microorganisms) according to the processes described in Japanese Patent Laid Open Publication Nos.
93049/1993 and 265065/1995 is low, and thus there is demand for an improved process for producing the
copolymerized polyester P(3HB-co-3HH).

Summary of the Invention 

The object of the present invention is to provide a polyester synthase gene, recombinant vectors containing the 
gene, transformants transformed with the recombinant vectors, and processes for producing polyester by use of the 
transformants. 

As a result of their intensive research, the present inventors succeeded in producing the polyester in high yield by
cloning a polyester synthase gene and deleting one or both of the open reading frames located upstream and
downstream of said gene, and thereby completed the present invention.

That is, the present invention is a polyester synthase gene coding for a polypeptide containing the amino acid
sequence of SEQ ID NO:2, or a sequence in which one or more amino acids of said amino acid sequence are deleted,
replaced or added, said polypeptide having polyester synthase activity. Said gene includes those containing e.g.
the nucleotide sequence of SEQ ID NO:1.

Further, the present invention is a gene expression cassette comprising said polyester synthase gene and either of
the open reading frames located upstream and downstream of said gene. In said gene expression cassette, the open
reading frame located upstream of the polyester synthase gene includes those (e.g. SEQ ID NO:3) containing DNA coding
for the amino acid sequence of SEQ ID NO:4, and the open reading frame located downstream of the polyester
synthase gene includes those (e.g. SEQ ID NO:5) containing DNA coding for a polypeptide containing the amino acid
sequence of SEQ ID NO:6, or a sequence in which one or more amino acids of said amino acid sequence are deleted,
replaced or added, said polypeptide having enoyl-CoA hydratase activity.

Even if one or more amino acids in the amino acid sequence of SEQ ID NO:2 have undergone mutations such as
deletion, replacement, addition etc., DNA coding for a polypeptide containing said amino acid sequence is also
contained in the gene of the present invention insofar as the polypeptide has polyester synthase activity. For example, DNA
coding for the amino acid sequence of SEQ ID NO:2 where methionine at the first position is deleted is also contained
in the gene of the present invention.

Further, the present invention is recombinant vectors comprising said polyester synthase gene or said gene 
expression cassette. 

Further, the present invention is transformants transformed with said recombinant vectors.

Further, the present invention is processes for producing polyester, wherein said transformant is cultured in a
medium, and polyester is recovered from the resulting culture. Examples of such polyester are copolymers (e.g.
poly(3-hydroxybutyrate-co-3-hydroxyhexanoate) random copolymers) of 3-hydroxyalkanoic acid represented by formula I:

wherein R represents a hydrogen atom or a C1 to C4 alkyl group.

Brief Description of the Drawing 

FIG. 1 shows the structure of the gene of the present invention. 

FIG. 2 is a photograph showing the result of SDS-polyacrylamide gel electrophoresis.

Detailed Description of the Invention 

Hereinafter, the present invention is described in detail. 

(1) Cloning of Polyester synthase gene 

The polyester synthase gene of the present invention is isolated from a microorganism belonging to the genus
Aeromonas.

First, genomic DNA is isolated from a strain having the polyester synthase gene. Such a strain includes e.g.
Aeromonas caviae.

Any known method can be used for preparation of genomic DNA. For example, Aeromonas caviae is cultured in
LB medium and then its genomic DNA is prepared by the hexadecyltrimethylammonium bromide method (Current
Protocols in Molecular Biology, vol. 1, page 2.4.3, John Wiley & Sons Inc., 1994).

The DNA obtained in this manner is partially digested with a suitable restriction enzyme (e.g. Sau3AI, BamHI, BglII
etc.) and the resulting DNA fragments are then dephosphorylated by treatment with alkaline phosphatase. They are
ligated into a vector previously cleaved with a restriction enzyme (e.g. BamHI, BglII etc.) to prepare a library.

Phage or plasmid capable of autonomously replicating in host microorganisms is used as the vector. The phage
vector includes e.g. EMBL3, M13, λgt11 etc., and the plasmid vector includes e.g. pBR322, pUC18, and pBluescript II
(Stratagene). Vectors capable of autonomously replicating in 2 or more host cells such as E. coli and Bacillus brevis, as
well as various shuttle vectors, can also be used. Such vectors are also cleaved with said restriction enzymes so that
their fragment can be obtained.

Conventional DNA ligase is used to ligate the resulting DNA fragments into the vector fragment. The DNA
fragments and the vector fragment are annealed and then ligated to produce a recombinant vector.

To introduce the recombinant vector into a host microorganism, any known method can be used. For example, if
the host microorganism is E. coli, the calcium method (Lederberg, E.M. et al., J. Bacteriol. 119, 1072 (1974)) and the
electroporation method (Current Protocols in Molecular Biology, vol. 1, page 1.8.4 (1994)) can be used. If phage DNA
is used, the in vitro packaging method (Current Protocols in Molecular Biology, vol. 1, page 5.7.1 (1994)) etc. can be
adopted. In the present invention, an in vitro packaging kit (Gigapack II, produced by Stratagene etc.) can also be used.

To obtain a DNA fragment containing the polyester synthase gene derived from Aeromonas caviae, a probe is then
prepared. The amino acid sequences of some polyester synthases are already known (Peoples, O.P. and Sinskey,
A.J., J. Biol. Chem., 264, 15293 (1989); Huisman, G.W. et al., J. Biol. Chem., 2191 (1991); Pieper, U. et al.,
FEMS Microbiol. Lett., 96, 73 (1992) etc.). Two conserved regions are selected from these amino acid sequences, and
nucleotide sequences coding for them are estimated to design oligonucleotides for use as primers. Examples of such
oligonucleotides include, but are not limited to, the 2 oligonucleotides
5'-CC(C/G)CC(C/G)TGGATCAA(T/C)AAGT(T/A)(T/C)TA(T/C)ATC-3' (SEQ ID NO:7) and
5'-(G/C)AGCCA(G/C)GC(G/C)GTCCA(A/G)TC(G/C)GGCCACCA-3' (SEQ ID NO:8).
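
The degenerate notation used for these primers, in which "(X/Y)" marks a position synthesized as a mixture of two bases, can be unpacked mechanically. The following Python sketch is an illustration only and not part of the disclosed method; the function name expand_degenerate and the variable names are hypothetical. It enumerates every concrete sequence contained in each primer mixture.

```python
import re
from itertools import product

def expand_degenerate(primer: str) -> list[str]:
    """Expand '(X/Y)' alternatives into every concrete sequence (hypothetical helper)."""
    # Each token is either a parenthesised set of alternatives or a fixed run of bases.
    tokens = re.findall(r"\(([^)]+)\)|([ACGT]+)", primer)
    choices = [alt.split("/") if alt else [fixed] for alt, fixed in tokens]
    return ["".join(parts) for parts in product(*choices)]

# Primer sequences as given in the text (written 5' to 3', spaces removed).
FORWARD = "CC(C/G)CC(C/G)TGGATCAA(T/C)AAGT(T/A)(T/C)TA(T/C)ATC"  # SEQ ID NO:7
REVERSE = "(G/C)AGCCA(G/C)GC(G/C)GTCCA(A/G)TC(G/C)GGCCACCA"      # SEQ ID NO:8

fwd = expand_degenerate(FORWARD)
rev = expand_degenerate(REVERSE)
print(len(fwd), "forward variants of length", len(fwd[0]))
print(len(rev), "reverse variants of length", len(rev[0]))
```

Under this reading, SEQ ID NO:7 has six two-fold degenerate positions and SEQ ID NO:8 has five, and both primers are 27 nucleotides long, i.e. nine codons of the conserved region each.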

Polymerase chain reaction (PCR) (Molecular Cloning, vol. 2, page 14.2 (1989)) is carried out using these
oligonucleotides as primers and the genomic DNA of Aeromonas caviae as a template. A partial fragment of the polyester
synthase gene is amplified by PCR.

Then, the partially amplified fragment thus obtained is labeled with a suitable reagent and used for colony
hybridization of the above genomic DNA library (Current Protocols in Molecular Biology, vol. 1, page 6.0.3 (1994)).

The E. coli is screened by colony hybridization, and a plasmid is recovered from it using the alkaline method
(Current Protocols in Molecular Biology, vol. 1, page 1.6.1 (1994)), whereby a DNA fragment containing the polyester
synthase gene is obtained.

The nucleotide sequence of said DNA fragment can be determined in e.g. an automatic nucleotide sequence
analyzer such as 373A DNA sequencer (Applied Biosystems) using a known method such as the Sanger method
(Molecular Cloning, vol. 2, page 13.3 (1989)).

The nucleotide sequence of the polyester synthase gene of the present invention is shown in SEQ ID NO:1, and
the amino acid sequence encoded by said gene is shown in SEQ ID NO:2, where some amino acids may have
undergone mutations such as deletion, replacement, addition etc. insofar as a polypeptide having said amino acid sequence
has polyester synthase activity. Further, the gene of the present invention encompasses not only the
nucleotide sequence coding for the amino acid sequence of SEQ ID NO:2 but also its degenerate variants which, differing
only in degenerate codons, code for the same polypeptide.

The above mutations such as deletion etc. can be induced by known site-directed mutagenesis (Current Protocols
in Molecular Biology, vol. 1, page 8.1.1 (1994)).

After the nucleotide sequence has been determined by the means described above, the gene of the present invention
can be obtained by chemical synthesis or the PCR technique using genomic DNA as a template, or by hybridization
using a DNA fragment having said nucleotide sequence as a probe.

(2) Preparation of Transformant 

The transformant of the present invention is obtained by introducing the recombinant vector of the present invention
into a host compatible with the expression vector used in constructing said recombinant vector.

The host is not particularly limited insofar as it can express the target gene. Examples are bacteria such as
microorganisms belonging to the genus Alcaligenes, microorganisms belonging to the genus Pseudomonas,
microorganisms belonging to the genus Bacillus, yeasts such as the genera Saccharomyces, Candida etc., and animal cells such
as COS cells, CHO cells etc.

If bacteria such as microorganisms belonging to the genus Alcaligenes, microorganisms belonging to the genus
Pseudomonas etc. are used as the host, the recombinant DNA of the present invention is preferably constituted such
that it contains a promoter, the DNA of the present invention, and a transcription termination sequence so as to be
capable of autonomous replication in the host. The expression vector includes pLA2917 (ATCC 37355) containing replication
origin RK2 and pJRD215 (ATCC 37533) containing replication origin RSF1010, which are replicated and maintained in
a broad range of hosts.

The promoter may be any promoter that can be expressed in the host. Examples are promoters derived from E. coli,
phage etc., such as trp promoter, lac promoter, PL promoter, PR promoter and T7 promoter. The method of introducing
the recombinant DNA into bacteria includes e.g. a method using calcium ions (Current Protocols in Molecular Biology,
vol. 1, page 1.8.1 (1994)) and the electroporation method (Current Protocols in Molecular Biology, vol. 1, page 1.8.4
(1994)).

If yeast is used as the host, expression vectors such as YEp13, YCp50 etc. are used. The promoter includes e.g.
gal1 promoter, gal10 promoter etc. The method of introducing the recombinant DNA into yeast includes e.g. the
electroporation method (Methods Enzymol., 194, 182-187 (1990)), the spheroplast method (Proc. Natl. Acad. Sci. USA, 84,
1929-1933 (1978)), the lithium acetate method (J. Bacteriol., 152, 163-168 (1983)) etc.

If animal cells are used as the host, expression vectors such as pcDNAI, pcDNAI/Amp (produced by Invitrogen)
etc. are used. The method of introducing the recombinant DNA into animal cells includes e.g. the electroporation
method, calcium phosphate method etc.

The nucleotide sequence determined as described above contains the polyester synthase gene as well as a
plurality of open reading frames (ORFs) upstream and downstream of it. That is, the polyester synthase gene forms an
operon with at least 2 ORFs under the control of a single promoter region.

The ORFs which are located respectively upstream and downstream of the polyester synthase gene are referred
to hereinafter as "ORF1" and "ORF3".

It is considered that ORF1 is an open reading frame of a gene involved in accumulating polyester in the
microorganism or a gene in the polyester biosynthesis system. It was revealed that ORF3 is an open reading frame of a gene
coding for enoyl-CoA hydratase (particularly (R)-specific enoyl-CoA hydratase) involved in biosynthesis of polyester.

As shown in FIG. 1, an EcoRI fragment carrying an expression regulatory region (expressed as "-35/-10" in FIG.
1A), the polyester synthase gene, ORF1, and ORF3 was cloned in the present invention (FIG. 1A). This fragment is
designated EE32.

Then, a fragment (a gene expression cassette) is prepared by deleting ORF1 and/or ORF3 from EE32, and this
cassette is introduced into a host whereby a transformant capable of efficiently producing polyester can be obtained.

In EE32, BglII restriction sites are introduced into the regions between the expression regulatory region and
the translation initiation codon of ORF1 and between the translation termination codon of ORF1 and the translation
initiation codon of the polyester synthase gene, and then ORF1 is deleted from EE32 by treatment with BglII (FIG. 1B).
Similarly, BamHI restriction sites are introduced into a region between the translation termination codon of the
polyester synthase gene and ORF3, and then ORF3 is deleted by treatment with BamHI (FIG. 1C).

To delete both ORF1 and ORF3, EE32 may be subjected to both of the above operations, i.e. deletion of ORF1 and
deletion of ORF3 (FIG. 1D).
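
The four constructs of FIG. 1 can be summarized as a simple bookkeeping exercise. The Python sketch below is purely illustrative: it treats EE32 as an ordered list of named elements (labels only, no sequence data), and the helper name delete_orfs is hypothetical. The transcription termination region is included on the assumption, taken from Example 2, that it is carried on the same fragment.

```python
# Ordered elements of the EE32 fragment as named in the text (labels only).
# Assumption: the transcription termination region is part of EE32, per Example 2.
EE32 = [
    "expression regulatory region",
    "ORF1",
    "polyester synthase gene",
    "ORF3",
    "transcription termination region",
]

def delete_orfs(fragment, orfs_to_delete):
    """Return the cassette left after removing the named ORFs, order preserved."""
    return [element for element in fragment if element not in orfs_to_delete]

# One cassette per panel of FIG. 1.
CASSETTES = {
    "FIG. 1A": delete_orfs(EE32, set()),             # intact EE32
    "FIG. 1B": delete_orfs(EE32, {"ORF1"}),          # BglII deletion
    "FIG. 1C": delete_orfs(EE32, {"ORF3"}),          # BamHI deletion
    "FIG. 1D": delete_orfs(EE32, {"ORF1", "ORF3"}),  # both deletions
}

for panel, cassette in CASSETTES.items():
    print(panel, "->", " | ".join(cassette))
```

In every cassette the polyester synthase gene stays under the control of the original expression regulatory region; only the flanking ORFs differ.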

The restriction enzyme sites can be introduced by site-directed mutagenesis using synthetic oligonucleotides
(Current Protocols in Molecular Biology, vol. 1, page 8.1.1 (1994)).

Each gene expression cassette thus obtained is inserted into said plasmid capable of expression (e.g. pJRD215
(ATCC 37533)) and the resulting recombinant vector is used to transform Alcaligenes eutrophus PHB-4 (DSM 541)
(strain deficient in the ability to synthesize polyester). The method for this transformation includes e.g. the calcium
chloride method, rubidium chloride method, low pH method, in vitro packaging method, conjugation transfer method etc.

(3) Production of Polyester 

The production of polyester is carried out by culturing the transformant of the present invention in a medium,
forming and accumulating the polyester of the present invention in the microorganism or in the culture, and recovering the
polyester from the cultured microorganism or from the culture.

A conventional method used for culturing the host is also used to culture the transformant of the present invention.
The medium for the transformant prepared from a microorganism belonging to the genus Alcaligenes or
Pseudomonas as the host includes a medium containing a carbon source assimilable by the microorganism, in which a
nitrogen source, inorganic salts or another organic nutrition source has been limited, for example a medium in which the
nutrition source has been limited to 0.01 to 0.1 %.

The carbon source is necessary for growth of the microorganism, and it is simultaneously a starting material of
polyester. Examples are carbohydrates such as glucose, fructose, sucrose, maltose etc. Further, fat and oil related
substances having 2 or more carbon atoms can be used as the carbon source. The fat and oil related substances include
natural fats and oils, such as corn oil, soybean oil, safflower oil, sunflower oil, olive oil, coconut oil, palm oil, rapeseed oil,
fish oil, whale oil, porcine oil and cattle oil; aliphatic acids such as acetic acid, propionic acid, butanoic acid, pentanoic
acid, hexanoic acid, octanoic acid, decanoic acid, lauric acid, oleic acid, palmitic acid, linolenic acid, linoleic acid and
myristic acid as well as esters thereof; and alcohols such as ethanol, propanol, butanol, pentanol, hexanol, octanol, lauryl
alcohol, oleyl alcohol and palmityl alcohol as well as esters thereof.

The nitrogen source includes e.g. ammonia, ammonium salts such as ammonium chloride, ammonium sulfate, 
ammonium phosphate etc., peptone, meat extract, yeast extract, corn steep liquor etc. The inorganic matter includes 
e.g. monopotassium phosphate, dipotassium phosphate, magnesium phosphate, magnesium sulfate, sodium chloride 
etc. 

Culture is carried out usually under aerobic conditions with shaking at 25 to 37 °C for more than 24 hours (e.g. 1 to
7 days) after expression is induced. During culture, antibiotics such as ampicillin, kanamycin, antipyrine, tetracycline
etc. may be added to the culture. Polyester is accumulated in the microorganism by culturing it, and the polyester is then
recovered.

To culture the microorganism transformed with the expression vector using an inducible promoter, its inducer can
also be added to the medium. For example, isopropyl-β-D-thiogalactopyranoside (IPTG), indoleacrylic acid (IAA) etc.
can be added to the medium.

To culture the transformant from animal cells as the host, use is made of a medium such as RPMI-1640 or DMEM
which may be supplemented with fetal bovine serum. Culture is carried out usually in the presence of 5 % CO2 at 30 to
37 °C for 14 to 28 days. During culture, antibiotics such as kanamycin, penicillin etc. may be added to the medium.

In the present invention, purification of polyester can be carried out e.g. as follows:

The transformant is recovered from the culture by centrifugation, then washed with distilled water and dried.
Thereafter, the dried transformant is suspended in chloroform and heated to extract polyester from it. The residues are
removed by filtration. Methanol is added to this chloroform solution to precipitate polyester. After the supernatant is
removed by filtration or centrifugation, the precipitates are dried to give purified polyester.

The resulting polyester is confirmed to be the desired one in a usual manner e.g. by gas chromatography, nuclear
magnetic resonance etc.

The gene of the present invention contains the polyester synthase gene isolated from Aeromonas caviae. This
synthase can synthesize a copolymer (polyester) consisting of a monomer unit 3-hydroxyalkanoic acid represented by
formula I:

wherein R represents a hydrogen atom or a C1 to C4 alkyl group. Said copolymer includes e.g.
poly(3-hydroxybutyrate-co-3-hydroxyhexanoate) random copolymer (P(3HB-co-3HH)) etc., and the transformant carrying said polyester
synthase gene has the ability to produce P(3HB-co-3HH) with very high efficiency.

Conventionally, processes for producing poly-3-hydroxybutyrate (P(3HB)) or
poly(3-hydroxybutyrate-co-3-hydroxyvalerate) random copolymer P(3HB-co-3HV) have been studied and developed, but such polyester has
poor impact resistance because it is a highly crystalline polymer.

Because the degree of crystallinity is lowered by introducing 3-hydroxyhexanoate having 6 carbon atoms into a
polymer chain, the polyester acts as a flexible polymeric material which is also excellent in thermostability and formability, but
conventional processes for producing P(3HB-co-3HH) by use of Aeromonas caviae (Japanese Patent Laid Open
Publication Nos. 93049/1993 and 265065/1995) suffer from a low yield of polyester.

In the present invention, the P(3HB-co-3HH) copolyester can be produced in high yield. 

Because the desired polyester can be obtained in a large amount using the above means, it can be used as a
biodegradable material for yarn, film, various vessels etc. Further, the gene of the present invention can be used to breed
a strain that produces the P(3HB-co-3HH) copolymer polyester in high yield.

Examples 

Hereinafter, the present invention is described in more detail with reference to the Examples which however are not
intended to limit the scope of the present invention.

[Example 1] Cloning of the Polyester Synthase Gene from Aeromonas caviae

First, a genomic DNA library was prepared from Aeromonas caviae.

Aeromonas caviae FA440 was cultured overnight in 100 ml LB medium (1 % yeast extract, 0.5 % tryptone, 0.5 %
sodium chloride, 0.1 % glucose, pH 7.5) at 30 °C and then genomic DNA was obtained from the microorganism using
the hexadecyltrimethylammonium bromide method (Current Protocols in Molecular Biology, vol. 1, page 2.4.3 (1994),
John Wiley & Sons Inc.).

The resulting genomic DNA was partially digested with restriction enzyme Sau3AI. The vector plasmid used was
cosmid vector pLA2917 (ATCC 37355).

This plasmid was cleaved with restriction enzyme BglII and dephosphorylated (Molecular Cloning, vol. 1, page
5.7.2 (1989), Cold Spring Harbor Laboratory) and then ligated to the partially digested genomic DNA fragments by use
of DNA ligase.

E. coli S17-1 was transformed with this ligated DNA fragment by the in vitro packaging method (Current Protocols
in Molecular Biology, vol. 1, page 5.7.2 (1994)) whereby a genomic DNA library from Aeromonas caviae was obtained.

To obtain a DNA fragment containing the polyester synthase gene from Aeromonas caviae, a probe was then
prepared. Two well conserved regions were selected from known amino acid sequences of several polyester synthases,
and nucleotide sequences coding for them were estimated, and 2 oligonucleotides
5'-CC(C/G)CC(C/G)TGGATCAA(T/C)AAGT(T/A)(T/C)TA(T/C)ATC-3' (SEQ ID NO:7) and
5'-(G/C)AGCCA(G/C)GC(G/C)GTCCA(A/G)TC(G/C)GGCCACCA-3' (SEQ ID NO:8) were synthesized.

The polyester synthase gene was partially amplified by PCR using these oligonucleotides as primers and the 
genomic DNA from Aeromonas caviae as a template. PCR was carried out using 30 cycles, each consisting of reaction 
at 94 °C for 30 seconds, 50 °C for 30 seconds, and 72 °C for 60 seconds. 

Then, this partially amplified fragment was labeled with digoxigenin using a DIG DNA labeling kit (Boehringer
Mannheim) and used as a probe.

Using the probe thus obtained, E. coli carrying a plasmid containing the polyester synthase gene was isolated by
colony hybridization from the genomic DNA library from Aeromonas caviae. By recovering the plasmid from the E. coli,
a DNA fragment containing the polyester synthase gene was obtained.

The nucleotide sequence of a 3.2 kbp BglII-EcoRI fragment from this fragment was determined by the Sanger
method.

As a result, the nucleotide sequence of the 3.2 kbp fragment as shown in SEQ ID NO:9 or 10 was determined.
By further examining homology to this nucleotide sequence, the polyester synthase gene containing the nucleotide
sequence (1785 bp) of SEQ ID NO:1 could be identified in this 3.2 kbp nucleotide sequence.

It should be understood that insofar as the protein encoded by the polyester synthase gene of the present invention
has the function of gene expression for polyester polymerization, the nucleotide sequence of said gene may have
undergone mutations such as deletion, replacement, addition etc.

In a fragment having the nucleotide sequence of SEQ ID NO:9 or 10, a 405 bp gene (ORF3) and a transcription
termination region located downstream of the above 1785 bp nucleotide sequence, as well as a 354 bp gene (ORF1)
and an expression regulatory region located upstream thereof were identified. The nucleotide sequence of ORF1 is
shown in SEQ ID NO:4; the nucleotide sequence of ORF3 in SEQ ID NO:5; and the amino acid sequence encoded by
ORF3 in SEQ ID NO:6.
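
As a quick consistency check on the reported gene lengths, each must be a whole number of codons. The Python sketch below is illustrative only; the helper name codons is hypothetical. The residue counts it prints rest on an assumption that the text does not state, namely that each reported length includes one terminal stop codon.

```python
# Gene lengths in base pairs as reported in the text.
GENE_LENGTHS_BP = {
    "ORF1": 354,
    "polyester synthase gene": 1785,
    "ORF3": 405,
}

def codons(length_bp: int) -> int:
    """Number of codons in an open reading frame; rejects lengths not divisible by 3."""
    if length_bp % 3 != 0:
        raise ValueError(f"{length_bp} bp is not a whole number of codons")
    return length_bp // 3

for gene, length_bp in GENE_LENGTHS_BP.items():
    n = codons(length_bp)
    # Assumption: the reported length includes one stop codon, so residues = codons - 1.
    print(f"{gene}: {length_bp} bp = {n} codons, {n - 1} residues if a stop codon is included")
```

All three lengths are divisible by 3, as expected for open reading frames.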

ORF3 is an open reading frame of a gene coding for enoyl-CoA hydratase involved in biosynthesis of polyester.
Insofar as a polypeptide having the amino acid sequence encoded by ORF3 has enoyl-CoA hydratase activity,
particularly (R)-specific enoyl-CoA hydratase activity, said amino acid sequence may have undergone mutations such as
deletion, replacement and addition of one or more amino acids.

In the nucleotide sequences of SEQ ID NOS:9 and 10, the expression regulatory region is located at positions 1 to
383 and the transcription termination region at positions 3010 to 3187.

[Example 2] Preparation of Alcaligenes eutrophus Transformant

The BglII site of the BglII-EcoRI fragment containing this expression regulatory region, ORF1, the polyester
synthase gene, ORF3, and the transcription termination region was made EcoRI-ended by use of an EcoRI linker
whereby a 3.2 kb EcoRI-EcoRI fragment (EE32 fragment) was obtained. This fragment was inserted into plasmid
pJRD215 (ATCC 37533) capable of expression in microorganisms belonging to the genus Alcaligenes, and the
resulting recombinant plasmid was transformed into Alcaligenes eutrophus PHB-4 (DSM 541) (strain deficient in the ability
to synthesize polyester) by the conjugation transfer method, as follows:

First, the recombinant plasmid was used to transform E. coli S17-1 by the calcium chloride method. The
recombinant E. coli thus obtained and Alcaligenes eutrophus PHB-4 were cultured overnight in 1.5 ml LB medium at 30 °C,
and the respective cultures, each 0.1 ml, were combined and cultured at 30 °C for 4 hours. This microbial mixture was
plated on MBF agar medium (0.9 % disodium phosphate, 0.15 % monopotassium phosphate, 0.05 % ammonium
chloride, 0.5 % fructose, 1.5 % agar, 0.3 mg/ml kanamycin) and cultured at 30 °C for 5 days.

Because Alcaligenes eutrophus PHB-4 is rendered resistant to kanamycin by transferring the plasmid in the
recombinant E. coli into it, the colonies grown on the MBF agar medium are transformants of Alcaligenes eutrophus. One
colony was isolated from these colonies so that Alcaligenes eutrophus AC32 (referred to hereinafter as AC32) was
obtained.

AC32 has been deposited as FERM BP-6038 with the National Institute of Bioscience and Human-Technology, 
Agency of Industrial Science and Technology, Japan. 

BglII restriction sites were introduced respectively into regions upstream and downstream of the ORF1
gene in the EE32 fragment by site-directed mutagenesis using a synthetic oligonucleotide (Current Protocols in
Molecular Biology, vol. 1, page 8.1.1 (1994)), and an ORF1 gene-free fragment was obtained by deleting the BglII-BglII
fragment from the EE32 fragment and then inserted into plasmid pJRD215. The resulting recombinant plasmid was used
to transform Alcaligenes eutrophus PHB-4 by the conjugation transfer method described above. The resulting
transformant is referred to hereinafter as AC321.

Similarly, BamHI restriction sites were introduced respectively into regions upstream and downstream of the
ORF3 gene in the EE32 fragment by site-directed mutagenesis, and an ORF3 gene-free fragment was obtained by
deleting the BamHI-BamHI fragment from the EE32 fragment and then inserted into plasmid pJRD215. The resulting
recombinant plasmid was used to transform Alcaligenes eutrophus PHB-4 by the conjugation transfer method
described above. The resulting transformant is referred to hereinafter as AC323.

Similarly, BglII restriction sites were introduced respectively into regions upstream and downstream of the
ORF1 gene and BamHI restriction sites were introduced respectively into regions upstream and downstream of
the ORF3 gene in the EE32 fragment, and a gene fragment free of both the ORF1 and ORF3 genes was obtained by
deleting the BglII-BglII and BamHI-BamHI fragments from the EE32 fragment and then inserted into plasmid pJRD215.
The resulting recombinant plasmid was used to transform Alcaligenes eutrophus PHB-4 by the conjugation transfer
method described above. The resulting transformant is referred to hereinafter as AC3213.

Further, the polyester synthase gene was amplified by PCR using the EE32 fragment as a template, and the
resulting amplification product was inserted into a region between an expression regulatory region and a transcription
termination region in a known polyester biosynthesis operon derived from Alcaligenes eutrophus. PCR was carried out using
5'-AGTTCCCGCCTCGGGTGTGGGTGAA-3' (SEQ ID NO:11) and 5'-GGCATATGCGCTCATGCGGCGTCCT-3' (SEQ
ID NO:12) as primers in 30 cycles each consisting of reaction at 94 °C for 30 seconds, 55 °C for 30 seconds and 72 °C
for 60 seconds.

This DNA fragment was inserted into plasmid pJRD215, and the resulting plasmid was used to transform
Alcaligenes eutrophus PHB-4 by the conjugation transfer method described above. The resulting transformant is referred to
hereinafter as AC29.

[Example 3] Synthesis of Polyester by Alcaligenes eutrophus Transformants

Each of Alcaligenes eutrophus H16, PHB-4, AC32, AC321, AC323, AC3213 and AC29 was inoculated into 95 ml
MB medium (0.9 % disodium phosphate, 0.15 % monopotassium phosphate, 0.05 % ammonium chloride) containing 1
ml of 1 % sodium octanoate and incubated in a flask at 30 °C. 0.2 g/L kanamycin was contained in the media for
strains AC32, AC321, AC323, AC3213 and AC29. 12, 24, 36 and 48 hours thereafter, 1 ml of 1 % sodium octanoate was
added to each medium (total amount of sodium octanoate added: 0.5 g) and the microorganisms were cultured for 72
hours.

Each of strains H16 and AC3213 was inoculated into the above MB medium to which 1 % olive oil, palm oil, corn oil
or oleic acid had been added, and each strain was cultured at 30 °C for 72 hours in a flask. 0.2 g/L kanamycin was
contained in the medium for strain AC3213.

Each of strains H16, AC32, AC321, AC323 and AC3213 was inoculated into the above MB medium to which 1 %
sodium heptanoate had been added, and each strain was cultured at 30 °C in a flask. The media for strains AC32,
AC321, AC323 and AC3213 contained 0.2 g/L kanamycin.

After 12, 24, 36 and 48 hours, 1 ml of 1 % sodium heptanoate was added to each medium (total amount of sodium
heptanoate added: 0.5 g), and the microorganisms were cultured for 72 hours.

The microorganisms were recovered by centrifugation, washed with distilled water and lyophilized, and the weight
of the dried microorganisms was determined. 2 ml of a sulfuric acid/methanol mixture (15 : 85) and 2 ml of chloroform
were added to 10-30 mg of the dried microorganisms, and the sample was sealed and heated at 100 °C for 140 minutes,
whereby the polyester in the microorganisms was decomposed into methyl esters. 1 ml of distilled water was added
thereto and the mixture was stirred vigorously. It was left to separate into 2 layers, and the lower organic layer
was removed and analyzed for its components by capillary gas chromatography on a Neutra BOND-1 capillary column
(25 m in length, 0.25 mm in inner diameter and 0.4 µm in liquid film thickness, manufactured by GL Science) in a
Shimadzu GC-14A. The temperature was raised at a rate of 8 °C/min from an initial temperature of 100 °C. The
results are shown in Tables 1, 2 and 3.

Table 1

Synthesis of Polyester Using Octanoic Acid as Carbon Source

Strain Used     Weight of Dried        Content of Polyester in    Polyester Comp. (mole-%)
A. eutrophus    Microorganism (g/l)    Dried Microorganism        3HB       3HH
                                       (weight-%)

H16             3.00                   86                         100       0
PHB-4           0.80                   0                          -         -
AC32            0.99                   33                         78        23
AC321           2.85                   92                         87        13
AC323           2.85                   92                         88        12
AC3213          3.64                   96                         85        15
AC29            3.20                   94                         92        8

3HB: 3-hydroxybutyrate, 3HH: 3-hydroxyhexanoate

Table 2

Synthesis of Polyester Using Vegetable Oil or Oleic Acid as Carbon Source

Strain Used     Carbon Source    Weight of Dried        Content of Polyester in    Polyester Comp. (mole-%)
A. eutrophus                     Microorganism (g/l)    Dried Microorganism        3HB       3HH
                                                        (weight-%)

H16             olive oil        4.27                   79                         100       0
                corn oil         3.57                   81                         100       0
                palm oil         4.13                   79                         100       0
                oleic acid       4.06                   82                         100       0
AC3213          olive oil        3.54                   76                         96        4
                corn oil         3.60                   77                         95        5
                palm oil         3.58                   81                         96        4
                oleic acid       2.22                   70                         96        4

3HB: 3-hydroxybutyrate, 3HH: 3-hydroxyhexanoate





Table 3

Synthesis of Polyester Using Heptanoic Acid as Carbon Source

Strain Used     Weight of Dried        Content of Polyester in    Polyester Comp. (mole-%)
A. eutrophus    Microorganism (g/l)    Dried Microorganism        3HB       3HV       3HHp
                                       (weight-%)

H16             2.50                   60                         50        50        0
AC32            0.77                   7                          30        67        5
AC321           1.67                   55                         46        52        2
AC323           1.27                   40                         48        45        7
AC3213          2.76                   67                         44        48        8

3HB: 3-hydroxybutyrate, 3HV: 3-hydroxyvalerate, 3HHp: 3-hydroxyheptanoate



As shown in Table 1, H16 (i.e. wild-type Alcaligenes eutrophus) synthesized a poly(3-hydroxybutyrate) homopolymer.
This is because 3HH (3-hydroxyhexanoate), which has 6 carbon atoms, does not serve as a substrate for the polyester
synthase possessed by H16. PHB-4 (i.e. the same strain as H16 but deficient in the ability to synthesize polyester) lacks
the polyester synthase and thus does not accumulate polyester. AC32, prepared by introducing into PHB-4 the EE32
fragment containing the polyester synthase gene derived from Aeromonas caviae, accumulated the
poly(3-hydroxybutyrate-co-3-hydroxyhexanoate) random copolymer (P(3HB-co-3HH)) containing 22 mole-% 3HH (3-hydroxyhexanoate),
and this copolymer accounted for 33 % by weight of the dried microorganism.

AC321, AC323 and AC3213 accumulated P(3HB-co-3HH) containing 12 to 15 mole-% 3HH, and the copolymer
accounted for 92 to 96 % by weight of the dried microorganisms. As can be seen from these results, the ability of these
strains to accumulate polyester was significantly improved by deleting the ORF1 gene and/or ORF3 gene.
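To compare the strains on a single scale, the dry cell weight and polyester content reported in Table 1 can be combined into an approximate polyester concentration per litre of culture. The sketch below is illustrative only: the product (dry weight x content) is a derived quantity that the text itself does not report, and the function name `polyester_titer` is hypothetical.

```python
# Values from Table 1: strain -> (dry cell weight in g/l, polyester content in weight-%).
TABLE_1 = {
    "H16": (3.00, 86),
    "PHB-4": (0.80, 0),
    "AC32": (0.99, 33),
    "AC321": (2.85, 92),
    "AC323": (2.85, 92),
    "AC3213": (3.64, 96),
    "AC29": (3.20, 94),
}


def polyester_titer(dry_weight_g_per_l, content_weight_pct):
    """Approximate polyester concentration (g/l) = dry weight x weight fraction."""
    return dry_weight_g_per_l * content_weight_pct / 100.0


# Derived titer for every strain (not a value reported in the text).
titers = {strain: polyester_titer(*values) for strain, values in TABLE_1.items()}

if __name__ == "__main__":
    for strain, titer in sorted(titers.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{strain:8s} {titer:.2f} g/l")
```

On this derived measure AC3213 ranks highest among the strains in Table 1, consistent with its having both the highest dry weight and the highest polyester content.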

P(3HB-co-3HH) was also accumulated in an amount of 94 % by weight of the microorganism even in the case of
AC29 carrying the polyester synthase gene derived from A. caviae whose expression regulatory region and
transcriptional termination region had been replaced by those derived from Alcaligenes eutrophus, indicating that the yield of
polyester was significantly improved even using the expression regulatory region and transcriptional termination region
of different origin.

When AC3213, which produced polyester in the highest yield, was cultured using olive oil, corn oil or palm oil as a carbon
source, the microorganism accumulated P(3HB-co-3HH) containing 4 to 5 mole-% 3HH, where the copolymer
accounted for 76 to 81 % by weight of the microorganism, as shown in Table 2. Even when oleic acid, the fatty acid
component contained most abundantly in vegetable oils, was used as a carbon source, AC3213 accumulated
P(3HB-co-3HH) containing 4 mole-% 3HH, where the copolymer accounted for 70 % by weight of the microorganism. The
corresponding wild strain H16 synthesized only poly(3-hydroxybutyrate) homopolymer under the same conditions.

Alcaligenes eutrophus FA440 is reported to have accumulated 8 % by weight of P(3HB-co-3HH) by use of palmitic
acid as a carbon source (Japanese Patent Laid Open Publication No. 265065/1995). In contrast, the transformant
according to the present invention accumulated 96 % by weight of P(3HB-co-3HH) by use of octanoic acid as
a carbon source and 76 to 81 % by weight of P(3HB-co-3HH) by use of extremely cheap vegetable oils as a carbon
source. This comparison indicates that the method of synthesizing P(3HB-co-3HH) by the transformant
used in the present example is an extremely superior method.

When heptanoic acid was used as a carbon source, H16, that is, the wild strain of Alcaligenes eutrophus, synthesized
poly(3-hydroxybutyrate-co-3-hydroxyvalerate) copolymer (P(3HB-co-3HV)). This is because 3HHp
(3-hydroxyheptanoate), which has 7 carbon atoms, does not serve as a substrate for the polyester synthase possessed by H16. AC32,
derived from PHB-4 by introduction of the EE32 fragment containing the polyester synthase gene derived from
Aeromonas caviae, accumulated poly(3-hydroxybutyrate-co-3-hydroxyvalerate-co-3-hydroxyheptanoate) terpolymer
(P(3HB-co-3HV-co-3HHp)) containing 5 mole-% 3HHp, where this copolymer accounted for 7 % by weight of the dried
microorganism.

Further, each of strains AC321, AC323 and AC3213 accumulated P(3HB-co-3HV-co-3HHp) containing 2 to 8 mole-%
3HHp, where the copolymer accounted for 40 to 67 % by weight of the microorganisms, indicating that the yield of
polyester was significantly improved by deleting the ORF1 gene and/or ORF3 gene (Table 3).

From these results, it is concluded that copolyesters consisting of 3-hydroxyalkanoic acids with 4 to 7 carbon atoms
can be synthesized using the polyester synthase derived from Aeromonas caviae.

[Example 4] Identification of Functions of ORF3 

The ORF3 gene was amplified by PCR using the EE32 fragment as a template and then inserted into a site
downstream of the T7 promoter in expression plasmid pET-3a (Novagen). PCR was carried out using
5'-GCCATATGAGCGCACAATCCCTGGAAGTAG-3' (SEQ ID NO:13) and 5'-CTGGGATCCGCCGGTGCTTAAGGCAGCTTG-3' (SEQ ID
NO:14) as primers in 25 cycles, each consisting of reaction at 95 °C for 60 seconds and 68 °C for 30 seconds. The
resulting plasmid was used to transform E. coli BL21 (DE3) (Novagen). The resulting transformant is designated NB3.
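As a rough consistency check, the two primers can be compared against the ends of the ORF3 coding sequence given in the sequence listing (SEQ ID NO: 5). The sketch below is an illustration under stated assumptions: the helper `longest_3prime_match` is hypothetical, only the first 30 and last 21 nucleotides of SEQ ID NO: 5 are used, and the observation that the 5' tails contain CATATG and GGATCC (the recognition sequences of NdeI and BamHI) is not something the text itself states.

```python
FORWARD = "GCCATATGAGCGCACAATCCCTGGAAGTAG"  # SEQ ID NO: 13
REVERSE = "CTGGGATCCGCCGGTGCTTAAGGCAGCTTG"  # SEQ ID NO: 14

# Ends of the ORF3 coding strand, taken from SEQ ID NO: 5.
ORF3_START = "ATGAGCGCACAATCCCTGGAAGTAGGCCAG"  # first 30 nt
ORF3_END = "GCCGTGGTCAAGCTGCCTTAA"             # last 21 nt, including the stop codon


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def longest_3prime_match(primer, template):
    """Length of the longest 3'-terminal stretch of the primer found in the template."""
    for n in range(len(primer), 0, -1):
        if primer[-n:] in template:
            return n
    return 0


# The forward primer should match the coding strand; the reverse primer
# should match the complementary strand at the 3' end of the ORF.
forward_match = longest_3prime_match(FORWARD, ORF3_START)
reverse_match = longest_3prime_match(REVERSE, reverse_complement(ORF3_END))

if __name__ == "__main__":
    print("forward primer 3' match:", forward_match, "nt")
    print("reverse primer 3' match:", reverse_match, "nt")
    print("CATATG in forward primer:", "CATATG" in FORWARD)
    print("GGATCC in reverse primer:", "GGATCC" in REVERSE)
```

The 3' portions of both primers match the ORF3 termini, while the non-matching 5' tails carry the added sites, which is the usual layout for primers used to clone a coding sequence into an expression vector.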

NB3 was cultured in LB medium at 30 °C for 4 hours, and isopropyl-β-D-thiogalactopyranoside (IPTG) was added
at a final concentration of 0.4 mM to induce expression, and it was further cultured at 30 °C for 2 hours. The
microorganism was recovered by centrifugation, disrupted by ultrasonication and centrifuged to give a soluble protein fraction.

As shown in Table 4, high enoyl-CoA hydratase activity was detected in the soluble fraction from the microorganism 
having the expression plasmid introduced into it. 

Table 4

Specific Activity of Enoyl-CoA Hydratase in Soluble Protein Fraction

                           (unit/mg protein)
E. coli BL21/pET-3a        0
E. coli NB3                1700


The enoyl-CoA hydratase activity was determined by measuring a change in absorbance (263 nm) due to double
bond hydration, using crotonyl-CoA (Sigma) as substrate (concentration: 0.25 mM). No activity was detected in E. coli
into which the control plasmid pET-3a free of the ORF3 gene had been introduced.

Then, the enoyl-CoA hydratase protein was purified. A soluble protein fraction from NB3 was applied to an anion
exchange column Q-Sepharose (Pharmacia) and eluted with a gradient of (0 to 1 M) NaCl, and a fraction with
enoyl-CoA hydratase activity was collected. SDS-PAGE analysis indicated that the active fraction was homogeneous in
electrophoresis as shown in FIG. 2. In addition, about 3-fold specific activity could be attained as shown in Table 5.

Table 5

Specific Activity of Enoyl-CoA Hydratase

                                             (unit/mg protein)
E. coli NB3 soluble protein fraction         1700
anion exchange column elution fraction       5100
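The "about 3-fold" figure follows directly from the two specific activities in Table 5. A minimal sketch of the arithmetic, with a hypothetical helper name:

```python
def fold_purification(specific_activity_after, specific_activity_before):
    """Ratio of specific activities (unit/mg protein) after and before a purification step."""
    return specific_activity_after / specific_activity_before


# Values from Table 5 (unit/mg protein).
SOLUBLE_FRACTION = 1700
ANION_EXCHANGE_FRACTION = 5100

fold = fold_purification(ANION_EXCHANGE_FRACTION, SOLUBLE_FRACTION)

if __name__ == "__main__":
    print(f"fold purification: {fold:.1f}")
```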



The N-terminal amino acid sequence of the enoyl-CoA hydratase protein thus purified was determined. As shown
in Table 6, the determined amino acid sequence was the same, except for Met in the initiation codon, as the amino acid
sequence deduced from the nucleotide sequence of the ORF3 gene.

Table 6

Comparison between Amino Acid Sequences

N-terminal amino acid sequence of
purified enoyl-CoA hydratase:         SAQSLEVGQKARLSKRFGAA (SEQ ID NO: 15)
amino acid sequence deduced from
ORF3 nucleotide sequence:            MSAQSLEVGQKARLSKRFGAA (SEQ ID NO: 16)



From this, it could be confirmed that ORF3 codes for enoyl-CoA hydratase. It is considered that Met was released 
by post-translational modification. 
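The comparison in Table 6 can be reproduced by translating the first 21 codons of ORF3 (taken from SEQ ID NO: 5 in the sequence listing) with the standard genetic code and comparing the result with the determined N-terminal sequence. The sketch below is illustrative; the function name `translate` and the compact codon-table encoding are choices made for this example, not part of the original method.

```python
# Standard genetic code, indexed by codon with bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def translate(dna):
    """Translate an upper-case DNA coding sequence into one-letter amino acids."""
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        b1, b2, b3 = (BASES.index(base) for base in dna[i:i + 3])
        protein.append(AMINO_ACIDS[16 * b1 + 4 * b2 + b3])
    return "".join(protein)


# First 21 codons of ORF3 as listed in SEQ ID NO: 5.
ORF3_5PRIME = "ATGAGCGCACAATCCCTGGAAGTAGGCCAGAAGGCCCGTCTCAGCAAGCGGTTCGGGGCGGCG"

# N-terminal sequence determined for the purified protein (SEQ ID NO: 15).
PURIFIED_N_TERMINUS = "SAQSLEVGQKARLSKRFGAA"

deduced = translate(ORF3_5PRIME)

if __name__ == "__main__":
    print("deduced :", deduced)
    print("purified:", " " + PURIFIED_N_TERMINUS)
    print("identical after removing initiator Met:", deduced[1:] == PURIFIED_N_TERMINUS)
```

The deduced sequence equals SEQ ID NO: 16, and dropping its first residue gives exactly the sequence determined for the purified protein.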

Further, the stereospecificity of enoyl-CoA hydratase encoded by ORF3 was examined as follows: 

By adding (S)-3-hydroxybutyryl-CoA dehydrogenase (Sigma) (final concentration: 0.2 unit/ml) and oxidized
nicotinamide adenine dinucleotide (NAD+) (final concentration: 0.5 mM) to a reaction solution for activity measurement,
(S)-3-hydroxybutyryl-CoA formed is oxidized to acetoacetyl-CoA by the action of the dehydrogenase if the enoyl-CoA
hydratase is specific to the (S)-isomer. During this reaction, NAD+ is reduced to form NADH, resulting in the generation
of a specific absorption at 340 nm. If enoyl-CoA hydratase is specific to the (R)-isomer, NADH is not formed.

As shown in Table 7, the change in absorbance at 340 nm when enoyl-CoA hydratase encoded by ORF3 was used
was the same as in the case where enoyl-CoA hydratase was absent, but if commercially available (S)-specific
enoyl-CoA hydratase (Sigma) was used, a change in absorbance due to formation of NADH was observed.

Table 7

Change in Absorbance at 340 nm after 1 Min.

no addition of enoyl-CoA hydratase                   0.045
ORF3-derived enoyl-CoA hydratase                     0.047
(S)-isomer specific enoyl-CoA hydratase (Sigma)      0.146


From this result, it was made evident that the purified enoyl-CoA hydratase is specific to the (R)-isomer. Thus, it 
was found that ORF3 codes for (R)-isomer specific enoyl-CoA hydratase. 
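The reasoning behind this conclusion can be made explicit by subtracting the no-enzyme background from each reading in Table 7. The following sketch is illustrative only; the dictionary keys and the helper `net_change` are hypothetical names, and no significance threshold is implied by the text.

```python
# Change in absorbance at 340 nm after 1 minute (values from Table 7).
DELTA_A340 = {
    "no hydratase": 0.045,
    "ORF3-derived hydratase": 0.047,
    "(S)-specific hydratase (Sigma)": 0.146,
}


def net_change(sample, blank="no hydratase"):
    """Background-corrected absorbance change, rounded to the precision of the table."""
    return round(DELTA_A340[sample] - DELTA_A340[blank], 3)


if __name__ == "__main__":
    for sample in DELTA_A340:
        print(f"{sample:32s} net change {net_change(sample):.3f}")
```

After background subtraction the ORF3-derived enzyme gives a net change of 0.002, against 0.101 for the (S)-specific control, so essentially no NADH is formed with the ORF3 product.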

According to the present invention, there are provided a polyester synthase gene, a recombinant vector carrying the
gene, a transformant carrying the recombinant vector and a process for producing polyester by use of the transformant.
The present invention is extremely useful in that the present gene codes for a polyester synthase capable of
synthesizing polyester as a copolymer consisting of monomer units represented by 3-hydroxyalkanoic acids having 4 to 7
carbon atoms, and that the present process can very efficiently synthesize a biodegradable plastic, P(3HB-co-3HH),
which is excellent in thermostability and formability.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT:
    (A) NAME: THE INSTITUTE OF PHYSICAL AND CHEMICAL RESEARCH
    (B) STREET: Hirosawa 2-1
    (C) CITY: Wako-shi
    (D) STATE: Saitama
    (E) COUNTRY: Japan
    (F) POSTAL CODE (ZIP): 351-01
    (G) TELEPHONE: 81-48-467-9263
    (H) TELEFAX: 81-48-462-4609

(ii) TITLE OF INVENTION: POLYESTER SYNTHASE GENE AND PROCESS FOR PRODUCING POLYESTER

(iii) NUMBER OF SEQUENCES: 16

(iv) COMPUTER READABLE FORM:
    (A) MEDIUM TYPE: Floppy disk
    (B) COMPUTER: IBM PC compatible
    (C) OPERATING SYSTEM: PC-DOS/MS-DOS
    (D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)

(v) CURRENT APPLICATION DATA:
    APPLICATION NUMBER: 97113932.4

(vi) PRIOR APPLICATION DATA:
    (A) APPLICATION NUMBER: JP 214509/1996
    (B) FILING DATE: 14-AUG-1996

(vi) PRIOR APPLICATION DATA:
    (A) APPLICATION NUMBER: JP 199979/1997
    (B) FILING DATE: 25-JUL-1997

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 1785 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: double
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 1..1782

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ATG AGC CAA CCA TCT TAT GGC CCG CTG TTC GAG GCC CTG GCC CAC TAC   48
Met Ser Gln Pro Ser Tyr Gly Pro Leu Phe Glu Ala Leu Ala His Tyr
1               5                   10                  15

AAT GAC AAG CTG CTG GCC ATG GCC AAG GCC CAG ACA GAG CGC ACC GCC   96
Asn Asp Lys Leu Leu Ala Met Ala Lys Ala Gln Thr Glu Arg Thr Ala
            20                  25                  30

CAG GCG CTG CTG CAG ACC AAT CTG GAC GAT CTG GGC CAG GTG CTG GAG   144
Gln Ala Leu Leu Gln Thr Asn Leu Asp Asp Leu Gly Gln Val Leu Glu
        35                  40                  45

CAG GGC AGC CAG CAA CCC TGG CAG CTG ATC GAG GCC GAG ATG AAC TGG   192
Gln Gly Ser Gln Gln Pro Trp Gln Leu Ile Gln Ala Gln Met Asn Trp
    50                  55                  60

TGG CAG GAT CAG CTC AAG CTG ATG CAG CAC ACC CTG CTC AAA AGC GCA   240
Trp Gln Asp Gln Leu Lys Leu Met Gln His Thr Leu Leu Lys Ser Ala
65                  70                  75                  80

GGC CAG CCG AGC GAG CCG GTG ATC ACC CCG GAG CGC AGC GAT CGC CGC   288
Gly Gln Pro Ser Glu Pro Val Ile Thr Pro Glu Arg Ser Asp Arg Arg
                85                  90                  95

TTC AAG GCC GAG GCC TGG AGC GAA CAA CCC ATC TAT GAC TAC CTC AAG   336
Phe Lys Ala Glu Ala Trp Ser Glu Gln Pro Ile Tyr Asp Tyr Leu Lys
            100                 105                 110

CAG TCC TAC CTG CTC ACC GCC AGG CAC CTG CTG GCC TCG GTG GAT GCC   384
Gln Ser Tyr Leu Leu Thr Ala Arg His Leu Leu Ala Ser Val Asp Ala
        115                 120                 125

CTG GAG GGC GTC CCC CAG AAG AGC CGG GAG CGG CTG CGT TTC TTC ACC   432
Leu Glu Gly Val Pro Gln Lys Ser Arg Glu Arg Leu Arg Phe Phe Thr
    130                 135                 140

CGC CAG TAC GTC AAC GCC ATG GCC CCC AGC AAC TTC CTG GCC ACC AAC   480
Arg Gln Tyr Val Asn Ala Met Ala Pro Ser Asn Phe Leu Ala Thr Asn
145                 150                 155                 160

CCC GAG CTG CTC AAG CTG ACC CTG GAG TCC GAC GGC CAG AAC CTG GTG   528
Pro Glu Leu Leu Lys Leu Thr Leu Glu Ser Asp Gly Gln Asn Leu Val
                165                 170                 175

CGC GGA CTG GCC CTC TTG GCC GAG GAT CTG GAG CGC AGC GCC GAT CAG   576
Arg Gly Leu Ala Leu Leu Ala Glu Asp Leu Glu Arg Ser Ala Asp Gln
            180                 185                 190

CTC AAC ATC CGC CTG ACC GAC GAA TCC GCC TTC GAG CTC GGG CGG GAT   624
Leu Asn Ile Arg Leu Thr Asp Glu Ser Ala Phe Glu Leu Gly Arg Asp
        195                 200                 205

CTG GCC CTG ACC CCG GGC CGG GTG GTG CAG CGC ACC GAG CTC TAT GAG   672
Leu Ala Leu Thr Pro Gly Arg Val Val Gln Arg Thr Glu Leu Tyr Glu
    210                 215                 220

CTC ATT CAG TAC AGC CCG ACT ACC GAG ACG GTG GGC AAG ACA CCT GTG   720
Leu Ile Gln Tyr Ser Pro Thr Thr Glu Thr Val Gly Lys Thr Pro Val
225                 230                 235                 240

CTG ATA GTG CCG CCC TTC ATC AAC AAG TAC TAC ATC ATG GAC ATG CGG   768
Leu Ile Val Pro Pro Phe Ile Asn Lys Tyr Tyr Ile Met Asp Met Arg
                245                 250                 255

CCC CAG AAC TCC CTG GTC GCC TGG CTG GTC GCC CAG GGC CAG ACG GTA   816
Pro Gln Asn Ser Leu Val Ala Trp Leu Val Ala Gln Gly Gln Thr Val
            260                 265                 270

TTC ATG ATC TCC TGG CGC AAC CCG GGC GTG GCC CAG GCC CAA ATC GAT   864
Phe Met Ile Ser Trp Arg Asn Pro Gly Val Ala Gln Ala Gln Ile Asp
        275                 280                 285

CTC GAC GAC TAC GTG GTG GAT GGC GTC ATC GCC GCC CTG GAC GGC GTG   912
Leu Asp Asp Tyr Val Val Asp Gly Val Ile Ala Ala Leu Asp Gly Val
    290                 295                 300

GAG GCG GCC ACC GGC GAG CGG GAG GTG CAC GGC ATC GGC TAC TGC ATC   960
Glu Ala Ala Thr Gly Glu Arg Glu Val His Gly Ile Gly Tyr Cys Ile
305                 310                 315                 320

GGC GGC ACC GCC CTG TCG CTC GCC ATG GGC TGG CTG GCG GCG CGG CGC   1008
Gly Gly Thr Ala Leu Ser Leu Ala Met Gly Trp Leu Ala Ala Arg Arg
                325                 330                 335

CAG AAG CAG CGG GTG CGC ACC GCC ACC CTG TTC ACT ACC CTG CTG GAC   1056
Gln Lys Gln Arg Val Arg Thr Ala Thr Leu Phe Thr Thr Leu Leu Asp
            340                 345                 350

TTC TCC CAG CCC GGG GAG CTT GGC ATC TTC ATC CAC GAG CCC ATC ATA   1104
Phe Ser Gln Pro Gly Glu Leu Gly Ile Phe Ile His Glu Pro Ile Ile
        355                 360                 365

GCG GCG CTC GAG GCG CAA AAT GAG GCC AAG GGC ATC ATG GAC GGG CGC   1152
Ala Ala Leu Glu Ala Gln Asn Glu Ala Lys Gly Ile Met Asp Gly Arg
    370                 375                 380

CAG CTG GCG GTC TCC TTC AGC CTG CTG CGG GAG AAC AGC CTC TAC TGG   1200
Gln Leu Ala Val Ser Phe Ser Leu Leu Arg Glu Asn Ser Leu Tyr Trp
385                 390                 395                 400

AAC TAC TAC ATC GAC AGC TAC CTC AAG GGT CAG AGC CCG GTG GCC TTC   1248
Asn Tyr Tyr Ile Asp Ser Tyr Leu Lys Gly Gln Ser Pro Val Ala Phe
                405                 410                 415

GAT CTG CTG CAC TGG AAC AGC GAC AGC ACC AAT GTG GCG GGC AAG ACC   1296
Asp Leu Leu His Trp Asn Ser Asp Ser Thr Asn Val Ala Gly Lys Thr
            420                 425                 430








CAC 


AAC 


AGC 


PTP 


CTG 


CGC 


CGT 


CTC 


TAC 


CTG 


GAG 


AAC 


CAG 


CTG 


GTG 


AAG 


1344 


His 


Asn 


Sex 


Leu 


Leu 


Arg 


a-l y 


Leu 




Leu 


Glu 


Asn 


Gin 


Leu 


Val 


Lys 








435 










440 










445 










GGG 


GAG 


CTC 


AAG 


ATC 


CGC 


AAC 


ACC 


CGC 


ATC 


GAT 


CTC 


GGC 


AAG 


GTG 


AAG 


1392 


Gly 


Gl u 


Leu 


Lys 


He 


Arg 




Thr 


Arg 


lie 


Asp 


Leu 


Glv 


Lys 


Val 


Lys 






D u 










455 










460 












ACC 


ll i 


ulu 


CTG 


CTG 


GTG 


TCG 


GCG 


GTG 


GAC 


GAT 


CAC 


ATC 


GCC 


CTC 


TGG 


1440 


Thr 






Leu 


Leu 


Val 


Ser 


Ala 


Val 


Asp 


Asp 


His 


He 


Ala 


Leu 


Trn 

* Mr 




4 6 5 










47 0 










475 










480 




pap 




ALL 




GAG 


GGC 


ATG 


AAG 


CTG 


TTT 


GGC 


GGG 


GAG 


CAG 


CGC 


TTC 


1488 


LI 11 


PI \r 




Trp 


Gin 




Met 


Lys 


Leu 


Phe 


Gl v 


Glv 
Lxy 


Glu 


Gin 


Arg 


Phe 












485 










490 










495 






CTC 


CTG 


GCG 


GAG 


•ppp 


ppp 

wL 


par 


ATP 
AIL 


ppp 

LLL 


ppp 

LLL 


ATP 


ATC 


AAC 


CCG 


CCG 


GCC 


1536 


Leu 


Leu 


Al a 


ulu 




PI V 


His 


He 


Ala 


PI V 


He 


He 


Asn 


Pro 


Pro 


Ala 




















505 










510 










AAL 


AAL 




cine 


lit 


TPP 
1 LL 


PAP 
LAL 


AAC 


GGG 


GCC 


GAG 


GCC 


GAG 


AGC 


CCG 


1584 


Aid 


Asn 


Lys 


Tyr 


uiy 


Phe 


Trp 


His 


Asn 


Gly 


Ala 


Glu 


Ala 


Glu 


Ser 


Pro 








tic 
jIj 










520 










525 










L\AVJ 






PTP 
Liu 


PPA 


ppp 


ppp 


APP 


CAC 


CAG 


GGC 


GGC 


TCC 


TGG 


TGG 


CCC 


1632 






Trp 


Leu 


nla 


Gly 


Ala 


Thr 


His 


Gin 


PI \r 




Ser 


Tm 


Trp 








e -j ft 










D J D 










540 












pap 


Al lj 


ATP 
Alu 


ppp 


TTT 


ATC 


CAG 


AAC 


CGT 


GAC 


GAA 


GGG 


TCA 


GAG 


CCC 


GTC 


1680 


Glu 


cut; u 


net 


PI vr 
uri y 


Phe 


He 


Gin 


Asn 


Ara 
Ai.y 


Asp 


Glu 


Glv 


Ser 


Glu 


Pro 


Val 




545 










550 










555 










560 




ccc 


riff* 
lll 




GTC 


CCG 


GAG 


GAA 


GGG 


CTG 


GCC 


CCC 


GCC 


CCC 


GGC 


CAC 


TAT 


1728 


Pro 


Ala 


Arg 


val 


Pro 


Glu 


Glu 


Gly 


Leu 


Ala 


Pro 


Ala 


Pro 


Gly 


His 


Tyr 












565 










570 










575 






GTC 


AAG 


GTG 


CGG 


CTC 


AAC 


CCC 


GTG 


TTT 


GCC 


TGC 


CCA 


ACA 


GAG 


GAG 


GAC 


1776 


val 


Lys 


Val 


Arg 


Leu 


Asn 


Pro 


Val 


Phe 


Ala 


Cys 


Pro 


Thr 


Glu 


Glu 


Asp 










580 










585 










590 










GCA 


TGA 




























1785 


Al a. 


Ala 
































(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 594 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS:
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
















Ser 


Gin 


Pro 


Ser 


Tyr 


Gly 


Pro 


Leu 


Phe 


Glu 


Ala 


Leu 


Ala 


His 


Tvr 

i. y i 












5 










10 










15 






Asn 


ASp 


Lys 


Leu 


Leu 


Ala 


Met 


Ala 


Lys 


Ala 


Gin 


Thr 


Glu 


Arg 


Thr 


Ala 










20 










25 










30 








Gin 


Ala 


Leu 


Leu 


Gin 


Thr 


Asn 


Leu 


Asp 


Asp 


Leu 


Gly 


Gin 


Val 


Leu 


Glu 








35 










40 










45 










Gin 


Gly 


Ser 


Gin 


Gin 


Pro 


Trp 


Gin 


Leu 


He 


Gin 


Ala 


Gin 


Met 


Asn 


Trp 






50 










55 










60 












Trp 


Gin 


Asp 


Gin 


Leu 


Lys 


Leu 


Met 


Gin 


His 


Thr 


Leu 


Leu 


Lys 


Ser 


Ala 




65 










70 










75 










80 





55 
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Gly Gin Pro Ser 

Phe Lys Ala Glu 
100 

Gin Ser Tyr Leu 
115 

Leu Glu Gly Val 
130 

Arg Gin Tyr Val 
145 

Pro Glu Leu Leu 

Arg Gly Leu Ala 
180 

Leu Asn lie Arg 
195 

Leu Ala Leu Thr 
210 

Leu lie Gin Tyr 
225 

Leu lie Val Pro 

Pro Gin Asn Ser 
260 

Phe Met lie Ser 
275 

Leu Asp Asp Tyr 
290 

Glu Ala Ala Thr 
305 

Gly Gly Thr Ala 

Gin Lys Gin Arg 
340 

Phe Ser Gin Pro 
355 

Ala Ala Leu Glu 
370 

Gin Leu Ala Val 
385 

Asn Tyr Tyr lie 

Asp Leu Leu His 
420 

His Asn Ser Leu 
435 

Gly Glu Leu Lys 
450 

Thr Pro Val Leu 
465 

Gin Gly Thr Trp 

Leu Leu Ala Glu 
500 

Ala Asn Lys Tyr 
515 

Glu Ser Trp Leu 
530 

Glu Met Met Gly 
545 

Pro Ala Arg Val 

Val Lys Val Arg 
580 

Ala Ala 



Glu Pro val lie 
85 

Ala Trp Ser Glu 

Leu Thr Ala Arg 
120 

Pro Gin Lys Ser 
135 

Asn Ala Met Ala 
150 

Lys Leu Thr Leu 
165 

Leu Leu Ala Glu 

Leu Thr Asp Glu 
200 

Pro Gly Arg val 
215 

Ser Pro Thr Thr 
230 

Pro Phe lie Asn 
245 

Leu Val Ala Trp 

Trp Arg Asn Pro 
280 

Val val Asp Gly 
295 

Gly Glu Arg Glu 
310 

Leu Ser Leu Ala 
325 

Val Arg Thr Ala 

Gly Glu Leu Gly 
360 

Ala Gin Asn Glu 
375 

Ser Phe Ser Leu 
390 

Asp Ser Tyr Leu 
405 

Trp Asn Ser Asp 

Leu Arg Arg Leu 
440 

lie Arg Asn Thr 
455 

Leu val Ser Ala 
470 

Gin Gly Met Lys 
485 

Ser Gly His He 

Gly Phe Trp His 
520 

Ala Gly Ala Thr 
535 

Phe lie Gin Asn 
550 

Pro Glu Glu Gly 
565 

Leu Asn Pro Val 



Thr 


Pro 


Glu 


Arg 




90 






Gin 


Pro 


He 


Tyr 


105 








His 


Leu 


Leu 


Ala 


Arg 


Glu 


Arg 


Leu 








140 


Pro 


Ser 


Asn 


Phe 






155 




Glu 


Ser 


Asp 


Gly 




170 






Asp 


Leu 


Glu 


Arg 


185 








Ser 


Ala 


Phe 


Glu 


val 


Gin 


Arg 


Thr 








220 


Glu 


Thr 


val 


Gly 






235 




Lys 


Tyr 


Tyr 


He 




250 






Leu 


val 


Ala 


Gin 


265 








Gly 


Val 


Ala 


Gin 


val 


lie 


Ala 


Ala 








300 


Val 


His 


Gly 


He 






315 




Met 


Gly 


Trp 


Leu 




330 






Thr 


Leu 


Phe 


Thr 


345 








He 


Phe 


He 


His 


Ala 


Lys 


Gly 


He 








380 


Leu 


Arg 


Glu 


Asn 






395 




Lys 


Gly 


Gin 


Ser 




410 






Ser 


Thr 


Asn 


Val 


425 








Tyr 


Leu 


Glu 


Asn 


Arg 


He 


Asp 


Leu 








460 


val 


Asp 


Asp 


His 






475 




Leu 


Phe 


Gly 


Gly 




490 






Ala 


Gly 


He 


He 


505 








Asn 


Gly 


Ala 


Glu 


His 


Gin 


Gly 


Gly 








540 


Arg 


Asp 


Glu 


Gly 






555 




Leu 


Ala 


Pro 


Ala 




570 






Phe 


Ala 


Cys 


Pro 



585 



Ser 


Asp 


Arg 


Arg 






95 




Asp 


Tyr 


Leu 


Lys 




110 






Ser 


val 


Asp 


Ala 


125 








Arg 


Phe 


Phe 


Thr 


Leu 


Ala 


Thr 


Asn 








160 


Gin 


Asn 


Leu 


val 






175 




Ser 


Ala 


Asp 


Gin 




190 






Leu 


Gly 


Arg 


Asp 


205 




Tyr 




Glu 


Leu 


Glu 


Lys 


Thr 


Pro 


Val 








240 


Met 


ASp 


Met 


Arg 






255 




Gly 


Gin 


Thr 


val 




270 






Ala 


Gin 


He 


Asp 


285 








Leu 


Asp 


Gly 


Val 


Gly 


Tyr 


Cys 


He 








320 


Ala 


Ala 


Arg 


Arg 






335 




Thr 


Leu 


Leu 


Asp 




350 






Glu 


Pro 


He 


He 


365 








Met 


Asp 


Gly 


Arg 


Ser 


Leu 


Tyr 


Trp 








400 


Pro 


Val 


Ala 


Phe 






415 




Ala 


Gly 


Lys 


Thr 




430 






Gin 


Leu 


val 


Lys 


445 








Gly 


Lys 


val 


Lys 


He 


Ala 


Leu 


Trp 








480 


Glu 


Gin 


Arg 


Phe 






495 




Asn 


Pro 


Pro 


Ala 




510 






Ala 


Glu 


Ser 


Pro 


525 








Ser 


Trp 


Trp 


Pro 


Ser 


Glu 


Pro 


val 








560 


Pro 


Gly 


His 


Tyr 






575 




Thr 


Glu 


Glu 


Asp 




590

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 354 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: double
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 1..351

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:



ATG ATG AAT ATG GAC GTG ATC AAG AGC TTT ACC GAG CAG ATG CAA GGC   48
Met Met Asn Met Asp Val Ile Lys Ser Phe Thr Glu Gln Met Gln Gly
1               5                   10                  15

TTC GCC GCC CCC CTC ACC CGC TAC AAC CAG CTG CTG GCC AGC AAC ATC   96
Phe Ala Ala Pro Leu Thr Arg Tyr Asn Gln Leu Leu Ala Ser Asn Ile
            20                  25                  30

GAA CAG CTG ACC CGG TTG CAG CTG GCC TCC GCC AAC GCC TAC GCC GAA   144
Glu Gln Leu Thr Arg Leu Gln Leu Ala Ser Ala Asn Ala Tyr Ala Glu
        35                  40                  45

CTG GGC CTC AAC CAG TTG CAG GCC GTG AGC AAG GTG CAG GAC ACC CAG   192
Leu Gly Leu Asn Gln Leu Gln Ala Val Ser Lys Val Gln Asp Thr Gln
    50                  55                  60

AGC CTG GCG GCC CTG GGC ACA GTG CAA CTG GAG ACC GCC AGC CAG CTC   240
Ser Leu Ala Ala Leu Gly Thr Val Gln Leu Glu Thr Ala Ser Gln Leu
65                  70                  75                  80

TCC CGC CAG ATG CTG GAT GAC ATC CAG AAG CTG AGC GCC CTC GGC CAG   288
Ser Arg Gln Met Leu Asp Asp Ile Gln Lys Leu Ser Ala Leu Gly Gln
                85                  90                  95

CAG TTC AAG GAA GAG CTG GAT GTC CTG ACC GCA GAC GGC ATC AAG AAA   336
Gln Phe Lys Glu Glu Leu Asp Val Leu Thr Ala Asp Gly Ile Lys Lys
            100                 105                 110

AGC ACG GGC AAG GCC TGA                                           354
Ser Thr Gly Lys Ala
        115

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 117 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS:
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Met Asn Met Asp Val Ile Lys Ser Phe Thr Glu Gln Met Gln Gly
1               5                   10                  15

Phe Ala Ala Pro Leu Thr Arg Tyr Asn Gln Leu Leu Ala Ser Asn Ile
            20                  25                  30

Glu Gln Leu Thr Arg Leu Gln Leu Ala Ser Ala Asn Ala Tyr Ala Glu
        35                  40                  45

Leu Gly Leu Asn Gln Leu Gln Ala Val Ser Lys Val Gln Asp Thr Gln
    50                  55                  60

Ser Leu Ala Ala Leu Gly Thr Val Gln Leu Glu Thr Ala Ser Gln Leu
65                  70                  75                  80

Ser Arg Gln Met Leu Asp Asp Ile Gln Lys Leu Ser Ala Leu Gly Gln
                85                  90                  95

Gln Phe Lys Glu Glu Leu Asp Val Leu Thr Ala Asp Gly Ile Lys Lys
            100                 105                 110

Ser Thr Gly Lys Ala
        115

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 405 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: double
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 1..402

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:



ATG AGC GCA CAA TCC CTG GAA GTA GGC CAG AAG GCC CGT CTC AGC AAG   48
Met Ser Ala Gln Ser Leu Glu Val Gly Gln Lys Ala Arg Leu Ser Lys
1               5                   10                  15

CGG TTC GGG GCG GCG GAG GTA GCC GCC TTC GCC GCG CTC TCG GAG GAC   96
Arg Phe Gly Ala Ala Glu Val Ala Ala Phe Ala Ala Leu Ser Glu Asp
            20                  25                  30

TTC AAC CCC CTG CAC CTG GAC CCG GCC TTC GCC GCC ACC ACG GCG TTC   144
Phe Asn Pro Leu His Leu Asp Pro Ala Phe Ala Ala Thr Thr Ala Phe
        35                  40                  45

GAG CGG CCC ATA GTC CAC GGC ATG CTG CTC GCC AGC CTC TTC TCC GGG   192
Glu Arg Pro Ile Val His Gly Met Leu Leu Ala Ser Leu Phe Ser Gly
    50                  55                  60

CTG CTG GGC CAG CAG TTG CCG GGC AAG GGG AGC ATC TAT CTG GGT CAA   240
Leu Leu Gly Gln Gln Leu Pro Gly Lys Gly Ser Ile Tyr Leu Gly Gln
65                  70                  75                  80

AGC CTC AGC TTC AAG CTG CCG GTC TTT GTC GGG GAC GAG GTG ACG GCC   288
Ser Leu Ser Phe Lys Leu Pro Val Phe Val Gly Asp Glu Val Thr Ala
                85                  90                  95

GAG GTG GAG GTG ACC GCC CTT CGC GAG GAC AAG CCC ATC GCC ACC CTG   336
Glu Val Glu Val Thr Ala Leu Arg Glu Asp Lys Pro Ile Ala Thr Leu
            100                 105                 110

ACC ACC CGC ATC TTC ACC CAA GGC GGC GCC CTC GCC GTG ACG GGG GAA   384
Thr Thr Arg Ile Phe Thr Gln Gly Gly Ala Leu Ala Val Thr Gly Glu
        115                 120                 125

GCC GTG GTC AAG CTG CCT TAA                                       405
Ala Val Val Lys Leu Pro
    130

























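The relationship between the SEQ ID NO: 5 coding sequence (CDS at 1..402, followed by a TAA stop codon, 405 base pairs in total) and the 134-residue protein of SEQ ID NO: 6 can be checked with a short translation routine. The following is only an illustrative sketch: it assumes the standard genetic code, and the names `CODON_TABLE` and `translate` are hypothetical helpers, not part of the listing.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G base order.
# (Assumption: the standard code applies to this organism's genes.)
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding sequence to one-letter amino acids, stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 48 nucleotides of SEQ ID NO: 5, copied from the listing.
first_row = "ATG AGC GCA CAA TCC CTG GAA GTA GGC CAG AAG GCC CGT CTC AGC AAG"
print(translate(first_row))  # one-letter form of Met Ser Ala Gln Ser Leu Glu Val ...
```

Applied to the full 405-nucleotide sequence, such a routine would be expected to return 134 residues, since 402 / 3 = 134.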
(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 134 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 



Met Ser Ala Gln Ser Leu Glu Val Gly Gln Lys Ala Arg Leu Ser Lys
  1               5                  10                  15

Arg Phe Gly Ala Ala Glu Val Ala Ala Phe Ala Ala Leu Ser Glu Asp
             20                  25                  30

Phe Asn Pro Leu His Leu Asp Pro Ala Phe Ala Ala Thr Thr Ala Phe
         35                  40                  45

Glu Arg Pro Ile Val His Gly Met Leu Leu Ala Ser Leu Phe Ser Gly
     50                  55                  60

Leu Leu Gly Gln Gln Leu Pro Gly Lys Gly Ser Ile Tyr Leu Gly Gln
 65                  70                  75                  80

Ser Leu Ser Phe Lys Leu Pro Val Phe Val Gly Asp Glu Val Thr Ala
                 85                  90                  95

Glu Val Glu Val Thr Ala Leu Arg Glu Asp Lys Pro Ile Ala Thr Leu
            100                 105                 110

Thr Thr Arg Ile Phe Thr Gln Gly Gly Ala Leu Ala Val Thr Gly Glu
        115                 120                 125

Ala Val Val Lys Leu Pro
    130

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "synthetic DNA"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

CCSCCSTGGA TCAAYAAGTW YTAYATC 



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "synthetic DNA" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

SAGCCASGCS GTCCARTCSG GCCACCA
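SEQ ID NO: 7 and SEQ ID NO: 8 contain the letters S, W, Y and R in addition to A, C, G and T. Read as IUPAC ambiguity codes (S = C or G, W = A or T, Y = C or T, R = A or G), each sequence stands for a pool of oligonucleotides rather than a single one. The sketch below, which assumes that reading and uses hypothetical helper names (`degeneracy`, `expand`), counts and enumerates the members of each pool.

```python
from itertools import product

# IUPAC codes that occur in SEQ ID NO: 7 and SEQ ID NO: 8.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "S": "CG", "W": "AT", "Y": "CT", "R": "AG",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in "".join(primer.split()).upper():
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every concrete sequence represented by a degenerate primer."""
    bases = "".join(primer.split()).upper()
    return ["".join(choice) for choice in product(*(IUPAC[b] for b in bases))]


seq7 = "CCSCCSTGGA TCAAYAAGTW YTAYATC"
seq8 = "SAGCCASGCS GTCCARTCSG GCCACCA"
print(degeneracy(seq7), degeneracy(seq8))
```

SEQ ID NO: 7 has six two-fold positions and SEQ ID NO: 8 has five, so the pools contain 2^6 = 64 and 2^5 = 32 sequences respectively.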

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3187 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 384..734

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 830..2611

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

AGATCTGGAC CGGGGTGCTG GCCTGGGCCA CGCCGGCGAG GGCCAGCGCG GAGCAACCGA    60
GCAGCAGGGC GAGAGGTTTC ATCGGGATTC CTTGGCAGTC TGAATGACGT GCCAGCCTAT   120
CAGCGCGGCG CCGGTGCGGC GAGGGCGCGC CGGACCCAGT GCGTCACCTC TCGTCTGATC   180
CGCCTCCCTC GACGGGCGTC GCTGACAAAA AAATTCAAAC AGAAATTAAC ATTTATGTCA   240
TTTACACCAA ACCGCATTTG GTTGCAGAAT GCTCAAACGT GTGTTTGAAC AGAGCAAGCA   300
ACACGTAAAC AGGGATGACA TGCAGTACCC GTAAGAAGGG CCGATTGGCC CACAACAACA   360

CTGTTCTGCC GAACTGGAGA CCG ATG ATG AAT ATG GAC GTG ATC AAG AGC       410
                          Met Met Asn Met Asp Val Ile Lys Ser
                            1               5

TTT ACC GAG CAG ATG CAA GGC TTC GCC GCC CCC CTC ACC CGC TAC AAC     458
Phe Thr Glu Gln Met Gln Gly Phe Ala Ala Pro Leu Thr Arg Tyr Asn
 10                  15                  20                  25

CAG CTG CTG GCC AGC AAC ATC GAA CAG CTG ACC CGG TTG CAG CTG GCC     506
Gln Leu Leu Ala Ser Asn Ile Glu Gln Leu Thr Arg Leu Gln Leu Ala
                 30                  35                  40

TCC GCC AAC GCC TAC GCC GAA CTG GGC CTC AAC CAG TTG CAG GCC GTG     554
Ser Ala Asn Ala Tyr Ala Glu Leu Gly Leu Asn Gln Leu Gln Ala Val
             45                  50                  55

AGC AAG GTG CAG GAC ACC CAG AGC CTG GCG GCC CTG GGC ACA GTG CAA     602
Ser Lys Val Gln Asp Thr Gln Ser Leu Ala Ala Leu Gly Thr Val Gln
         60                  65                  70

CTG GAG ACC GCC AGC CAG CTC TCC CGC CAG ATG CTG GAT GAC ATC CAG     650
Leu Glu Thr Ala Ser Gln Leu Ser Arg Gln Met Leu Asp Asp Ile Gln
     75                  80                  85

AAG CTG AGC GCC CTC GGC CAG CAG TTC AAG GAA GAG CTG GAT GTC CTG     698
Lys Leu Ser Ala Leu Gly Gln Gln Phe Lys Glu Glu Leu Asp Val Leu
 90                  95                 100                 105

ACC GCA GAC GGC ATC AAG AAA AGC ACG GGC AAG GCC TGATAACCCC          744
Thr Ala Asp Gly Ile Lys Lys Ser Thr Gly Lys Ala
                110                 115

TGGCTGCCCG TTCGGGCAGC CACATCTCCC CATGACTCGA CGCTACGGGC TAGTTCCCGC   804

CTCGGGTGTG GGTGAAGGAG AGCAC ATG AGC CAA CCA TCT TAT GGC CCG CTG     856
                            Met Ser Gln Pro Ser Tyr Gly Pro Leu
                              1               5

TTC GAG GCC CTG GCC CAC TAC AAT GAC AAG CTG CTG GCC ATG GCC AAG     904
Phe Glu Ala Leu Ala His Tyr Asn Asp Lys Leu Leu Ala Met Ala Lys
 10                  15                  20                  25

GCC CAG ACA GAG CGC ACC GCC CAG GCG CTG CTG CAG ACC AAT CTG GAC     952
Ala Gln Thr Glu Arg Thr Ala Gln Ala Leu Leu Gln Thr Asn Leu Asp
                 30                  35                  40

GAT CTG GGC CAG GTG CTG GAG CAG GGC AGC CAG CAA CCC TGG CAG CTG    1000
Asp Leu Gly Gln Val Leu Glu Gln Gly Ser Gln Gln Pro Trp Gln Leu
             45                  50                  55

ATC CAG GCC CAG ATG AAC TGG TGG CAG GAT CAG CTC AAG CTG ATG CAG    1048
Ile Gln Ala Gln Met Asn Trp Trp Gln Asp Gln Leu Lys Leu Met Gln
         60                  65                  70

CAC ACC CTG CTC AAA AGC GCA GGC CAG CCG AGC GAG CCG GTG ATC ACC    1096
His Thr Leu Leu Lys Ser Ala Gly Gln Pro Ser Glu Pro Val Ile Thr
     75                  80                  85

CCG GAG CGC AGC GAT CGC CGC TTC AAG GCC GAG GCC TGG AGC GAA CAA    1144
Pro Glu Arg Ser Asp Arg Arg Phe Lys Ala Glu Ala Trp Ser Glu Gln
 90                  95                 100                 105

CCC ATC TAT GAC TAC CTC AAG CAG TCC TAC CTG CTC ACC GCC AGG CAC    1192
Pro Ile Tyr Asp Tyr Leu Lys Gln Ser Tyr Leu Leu Thr Ala Arg His
                110                 115                 120

CTG CTG GCC TCG GTG GAT GCC CTG GAG GGC GTC CCC CAG AAG AGC CGG    1240
Leu Leu Ala Ser Val Asp Ala Leu Glu Gly Val Pro Gln Lys Ser Arg
            125                 130                 135

GAG CGG CTG CGT TTC TTC ACC CGC CAG TAC GTC AAC GCC ATG GCC CCC    1288
Glu Arg Leu Arg Phe Phe Thr Arg Gln Tyr Val Asn Ala Met Ala Pro
        140                 145                 150

AGC AAC TTC CTG GCC ACC AAC CCC GAG CTG CTC AAG CTG ACC CTG GAG    1336
Ser Asn Phe Leu Ala Thr Asn Pro Glu Leu Leu Lys Leu Thr Leu Glu
    155                 160                 165

TCC GAC GGC CAG AAC CTG GTG CGC GGA CTG GCC CTC TTG GCC GAG GAT    1384
Ser Asp Gly Gln Asn Leu Val Arg Gly Leu Ala Leu Leu Ala Glu Asp
170                 175                 180                 185

CTG GAG CGC AGC GCC GAT CAG CTC AAC ATC CGC CTG ACC GAC GAA TCC    1432
Leu Glu Arg Ser Ala Asp Gln Leu Asn Ile Arg Leu Thr Asp Glu Ser
                190                 195                 200

GCC TTC GAG CTC GGG CGG GAT CTG GCC CTG ACC CCG GGC CGG GTG GTG    1480
Ala Phe Glu Leu Gly Arg Asp Leu Ala Leu Thr Pro Gly Arg Val Val
            205                 210                 215

CAG CGC ACC GAG CTC TAT GAG CTC ATT CAG TAC AGC CCG ACT ACC GAG    1528
Gln Arg Thr Glu Leu Tyr Glu Leu Ile Gln Tyr Ser Pro Thr Thr Glu
        220                 225                 230

ACG GTG GGC AAG ACA CCT GTG CTG ATA GTG CCG CCC TTC ATC AAC AAG    1576
Thr Val Gly Lys Thr Pro Val Leu Ile Val Pro Pro Phe Ile Asn Lys
    235                 240                 245

TAC TAC ATC ATG GAC ATG CGG CCC CAG AAC TCC CTG GTC GCC TGG CTG    1624
Tyr Tyr Ile Met Asp Met Arg Pro Gln Asn Ser Leu Val Ala Trp Leu
250                 255                 260                 265

GTC GCC CAG GGC CAG ACG GTA TTC ATG ATC TCC TGG CGC AAC CCG GGC    1672
Val Ala Gln Gly Gln Thr Val Phe Met Ile Ser Trp Arg Asn Pro Gly
                270                 275                 280

GTG GCC CAG GCC CAA ATC GAT CTC GAC GAC TAC GTG GTG GAT GGC GTC    1720
Val Ala Gln Ala Gln Ile Asp Leu Asp Asp Tyr Val Val Asp Gly Val
            285                 290                 295

ATC GCC GCC CTG GAC GGC GTG GAG GCG GCC ACC GGC GAG CGG GAG GTG    1768
Ile Ala Ala Leu Asp Gly Val Glu Ala Ala Thr Gly Glu Arg Glu Val
        300                 305                 310

CAC GGC ATC GGC TAC TGC ATC GGC GGC ACC GCC CTG TCG CTC GCC ATG    1816
His Gly Ile Gly Tyr Cys Ile Gly Gly Thr Ala Leu Ser Leu Ala Met
    315                 320                 325

GGC TGG CTG GCG GCG CGG CGC CAG AAG CAG CGG GTG CGC ACC GCC ACC    1864
Gly Trp Leu Ala Ala Arg Arg Gln Lys Gln Arg Val Arg Thr Ala Thr
330                 335                 340                 345

CTG TTC ACT ACC CTG CTG GAC TTC TCC CAG CCC GGG GAG CTT GGC ATC    1912
Leu Phe Thr Thr Leu Leu Asp Phe Ser Gln Pro Gly Glu Leu Gly Ile
                350                 355                 360

TTC ATC CAC GAG CCC ATC ATA GCG GCG CTC GAG GCG CAA AAT GAG GCC    1960
Phe Ile His Glu Pro Ile Ile Ala Ala Leu Glu Ala Gln Asn Glu Ala
            365                 370                 375

AAG GGC ATC ATG GAC GGG CGC CAG CTG GCG GTC TCC TTC AGC CTG CTG    2008
Lys Gly Ile Met Asp Gly Arg Gln Leu Ala Val Ser Phe Ser Leu Leu
        380                 385                 390

CGG GAG AAC AGC CTC TAC TGG AAC TAC TAC ATC GAC AGC TAC CTC AAG    2056
Arg Glu Asn Ser Leu Tyr Trp Asn Tyr Tyr Ile Asp Ser Tyr Leu Lys
    395                 400                 405

GGT CAG AGC CCG GTG GCC TTC GAT CTG CTG CAC TGG AAC AGC GAC AGC    2104
Gly Gln Ser Pro Val Ala Phe Asp Leu Leu His Trp Asn Ser Asp Ser
410                 415                 420                 425

ACC AAT GTG GCG GGC AAG ACC CAC AAC AGC CTG CTG CGC CGT CTC TAC    2152
Thr Asn Val Ala Gly Lys Thr His Asn Ser Leu Leu Arg Arg Leu Tyr
                430                 435                 440

CTG GAG AAC CAG CTG GTG AAG GGG GAG CTC AAG ATC CGC AAC ACC CGC    2200
Leu Glu Asn Gln Leu Val Lys Gly Glu Leu Lys Ile Arg Asn Thr Arg
            445                 450                 455

ATC GAT CTC GGC AAG GTG AAG ACC CCT GTG CTG CTG GTG TCG GCG GTG    2248
Ile Asp Leu Gly Lys Val Lys Thr Pro Val Leu Leu Val Ser Ala Val
        460                 465                 470

GAC GAT CAC ATC GCC CTC TGG CAG GGC ACC TGG CAG GGC ATG AAG CTG    2296
Asp Asp His Ile Ala Leu Trp Gln Gly Thr Trp Gln Gly Met Lys Leu
    475                 480                 485

TTT GGC GGG GAG CAG CGC TTC CTC CTG GCG GAG TCC GGC CAC ATC GCC    2344
Phe Gly Gly Glu Gln Arg Phe Leu Leu Ala Glu Ser Gly His Ile Ala
490                 495                 500                 505

GGC ATC ATC AAC CCG CCG GCC GCC AAC AAG TAC GGC TTC TGG CAC AAC    2392
Gly Ile Ile Asn Pro Pro Ala Ala Asn Lys Tyr Gly Phe Trp His Asn
                510                 515                 520

GGG GCC GAG GCC GAG AGC CCG GAG AGC TGG CTG GCA GGG GCG ACG CAC    2440
Gly Ala Glu Ala Glu Ser Pro Glu Ser Trp Leu Ala Gly Ala Thr His
            525                 530                 535

CAG GGC GGC TCC TGG TGG CCC GAG ATG ATG GGC TTT ATC CAG AAC CGT    2488
Gln Gly Gly Ser Trp Trp Pro Glu Met Met Gly Phe Ile Gln Asn Arg
        540                 545                 550

GAC GAA GGG TCA GAG CCC GTC CCC GCG CGG GTC CCG GAG GAA GGG CTG    2536
Asp Glu Gly Ser Glu Pro Val Pro Ala Arg Val Pro Glu Glu Gly Leu
    555                 560                 565

GCC CCC GCC CCC GGC CAC TAT GTC AAG GTG CGG CTC AAC CCC GTG TTT    2584
Ala Pro Ala Pro Gly His Tyr Val Lys Val Arg Leu Asn Pro Val Phe
570                 575                 580                 585

GCC TGC CCA ACA GAG GAG GAC GCC GCA TGAGCGCACA ATCCCTGGAA          2631
Ala Cys Pro Thr Glu Glu Asp Ala Ala
                590

GTAGGCCAGA AGGCCCGTCT CAGCAAGCGG TTCGGGGCGG CGGAGGTAGC CGCCTTCGCC  2691
GCGCTCTCGG AGGACTTCAA CCCCCTGCAC CTGGACCCGG CCTTCGCCGC CACCACGGCG  2751
TTCGAGCGGC CCATAGTCCA CGGCATGCTG CTCGCCAGCC TCTTCTCCGG GCTGCTGGGC  2811
CAGCAGTTGC CGGGCAAGGG GAGCATCTAT CTGGGTGAAA GCCTCAGCTT CAAGCTGCCG  2871
GTCTTTGTCG GGGACGAGGT GACGGCCGAG GTGGAGGTGA CCGCCCTTCG CGAGGACAAG  2931
CCCATCGCCA CCCTGACCAC CCGCATCTTC ACCCAAGGCG GCGCCCTCGC CGTGACGGGG  2991
GAAGCCGTGG TCAAGCTGCC TTAAGCACCG GCGGCACGCA GGCACAATCA GCCCGGCCCC  3051
TGCCGGGCTG ATTGTTCTCC CCCGCTCCGC TTGCCCCCTT TTTCGGGGCA ATTTGGCCCA  3111
GGCCCTTTCC CTGCCCCGCC TAACTGCCTA AAATGGCCGC CCTGCCGTGT AGGCATTCAT  3171
CCAGCTAGAG GAATTC                                                  3187


(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3187 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 2611..3012

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

AGATCTGGAC CGGGGTGCTG GCCTGGGCCA CGCCGGCGAG GGCCAGCGCG GAGCAACCGA    60
GCAGCAGGGC GAGAGGTTTC ATCGGGATTC CTTGGCAGTC TGAATGACGT GCCAGCCTAT   120
CAGCGCGGCG CCGGTGCGGC GAGGGCGCGC CGGACCCAGT GCGTCACCTC TCGTCTGATC   180
CGCCTCCCTC GACGGGCGTC GCTGACAAAA AAATTCAAAC AGAAATTAAC ATTTATGTCA   240
TTTACACCAA ACCGCATTTG GTTGCAGAAT GCTCAAACGT GTGTTTGAAC AGAGCAAGCA   300
ACACGTAAAC AGGGATGACA TGCAGTACCC GTAAGAAGGG CCGATTGGCC CACAACAACA   360
CTGTTCTGCC GAACTGGAGA CCGATGATGA ATATGGACGT GATCAAGAGC TTTACCGAGC   420
AGATGCAAGG CTTCGCCGCC CCCCTCACCC GCTACAACCA GCTGCTGGCC AGCAACATCG   480
AACAGCTGAC CCGGTTGCAG CTGGCCTCCG CCAACGCCTA CGCCGAACTG GGCCTCAACC   540
AGTTGCAGGC CGTGAGCAAG GTGCAGGACA CCCAGAGCCT GGCGGCCCTG GGCACAGTGC   600
AACTGGAGAC CGCCAGCCAG CTCTCCCGCC AGATGCTGGA TGACATCCAG AAGCTGAGCG   660
CCCTCGGCCA GCAGTTCAAG GAAGAGCTGG ATGTCCTGAC CGCAGACGGC ATCAAGAAAA   720
GCACGGGCAA GGCCTGATAA CCCCTGGCTG CCCGTTCGGG CAGCCACATC TCCCCATGAC   780
TCGACGCTAC GGGCTAGTTC CCGCCTCGGG TGTGGGTGAA GGAGAGCACA TGAGCCAACC   840
ATCTTATGGC CCGCTGTTCG AGGCCCTGGC CCACTACAAT GACAAGCTGC TGGCCATGGC   900
CAAGGCCCAG ACAGAGCGCA CCGCCCAGGC GCTGCTGCAG ACCAATCTGG ACGATCTGGG   960
CCAGGTGCTG GAGCAGGGCA GCCAGCAACC CTGGCAGCTG ATCCAGGCCC AGATGAACTG  1020
GTGGCAGGAT CAGCTCAAGC TGATGCAGCA CACCCTGCTC AAAAGCGCAG GCCAGCCGAG  1080
CGAGCCGGTG ATCACCCCGG AGCGCAGCGA TCGCCGCTTC AAGGCCGAGG CCTGGAGCGA  1140
ACAACCCATC TATGACTACC TCAAGCAGTC CTACCTGCTC ACCGCCAGGC ACCTGCTGGC  1200
CTCGGTGGAT GCCCTGGAGG GCGTCCCCCA GAAGAGCCGG GAGCGGCTGC GTTTCTTCAC  1260
CCGCCAGTAC GTCAACGCCA TGGCCCCCAG CAACTTCCTG GCCACCAACC CCGAGCTGCT  1320
CAAGCTGACC CTGGAGTCCG ACGGCCAGAA CCTGGTGCGC GGACTGGCCC TCTTGGCCGA  1380
GGATCTGGAG CGCAGCGCCG ATCAGCTCAA CATCCGCCTG ACCGACGAAT CCGCCTTCGA  1440
GCTCGGGCGG GATCTGGCCC TGACCCCGGG CCGGGTGGTG CAGCGCACCG AGCTCTATGA  1500
GCTCATTCAG TACAGCCCGA CTACCGAGAC GGTGGGCAAG ACACCTGTGC TGATAGTGCC  1560
GCCCTTCATC AACAAGTACT ACATCATGGA CATGCGGCCC CAGAACTCCC TGGTGGCCTG  1620
GCTGGTCGCC CAGGGCCAGA CGGTATTCAT GATCTCCTGG CGCAACCCGG GCGTGGCCCA  1680
GGCCCAAATC GATCTCGACG ACTACGTGGT GGATGGCGTC ATCGCCGCCC TGGACGGCGT  1740
GGAGGCGGCC ACCGGCGAGC GGGAGGTGCA CGGCATCGGC TACTGCATCG GCGGCACCGC  1800
CCTGTCGCTC GCCATGGGCT GGCTGGCGGC GCGGCGCCAG AAGCAGCGGG TGCGCACCGC  1860
CACCCTGTTC ACTACCCTGC TGGACTTCTC CCAGCCCGGG GAGCTTGGCA TCTTCATCCA  1920
CGAGCCCATC ATAGCGGCGC TCGAGGCGCA AAATGAGGCC AAGGGCATCA TGGACGGGCG  1980
CCAGCTGGCG GTCTCCTTCA GCCTGCTGCG GGAGAACAGC CTCTACTGGA ACTACTACAT  2040
CGACAGCTAC CTCAAGGGTC AGAGCCCGGT GGCCTTCGAT CTGCTGCACT GGAACAGCGA  2100
CAGCACCAAT GTGGCGGGCA AGACCCACAA CAGCCTGCTG CGCCGTCTCT ACCTGGAGAA  2160
CCAGCTGGTG AAGGGGGAGC TCAAGATCCG CAACACCCGC ATCGATCTCG GCAAGGTGAA  2220
GACCCCTGTG CTGCTGGTGT CGGCGGTGGA CGATCACATC GCCCTCTGGC AGGGCACCTG  2280
GCAGGGCATG AAGCTGTTTG GCGGGGAGCA GCGCTTCCTC CTGGCGGAGT CCGGCCACAT  2340
CGCCGGCATC ATCAACCCGC CGGCCGCCAA CAAGTACGGC TTCTGGCACA ACGGGGCCGA  2400
GGCCGAGAGC CCGGAGAGCT GGCTGGCAGG GGCGACGCAC CAGGGCGGCT CCTGGTGGCC  2460
CGAGATGATG GGCTTTATCC AGAACCGTGA CGAAGGGTCA GAGCCCGTCC CCGCGCGGGT  2520
CCCGGAGGAA GGGCTGGCCC CCGCCCCCGG CCACTATGTC AAGGTGCGGC TCAACCCCGT  2580

GTTTGCCTGC CCAACAGAGG AGGACGCCGC ATG AGC GCA CAA TCC CTG GAA GTA   2634
                                 Met Ser Ala Gln Ser Leu Glu Val
                                   1               5



GGC CAG AAG GCC CGT CTC AGC AAG CGG TTC GGG GCG GCG GAG GTA GCC    2682
Gly Gln Lys Ala Arg Leu Ser Lys Arg Phe Gly Ala Ala Glu Val Ala
     10                  15                  20

GCC TTC GCC GCG CTC TCG GAG GAC TTC AAC CCC CTG CAC CTG GAC CCG    2730
Ala Phe Ala Ala Leu Ser Glu Asp Phe Asn Pro Leu His Leu Asp Pro
 25                  30                  35                  40

GCC TTC GCC GCC ACC ACG GCG TTC GAG CGG CCC ATA GTC CAC GGC ATG    2778
Ala Phe Ala Ala Thr Thr Ala Phe Glu Arg Pro Ile Val His Gly Met
                 45                  50                  55

CTG CTC GCC AGC CTC TTC TCC GGG CTG CTG GGC CAG CAG TTG CCG GGC    2826
Leu Leu Ala Ser Leu Phe Ser Gly Leu Leu Gly Gln Gln Leu Pro Gly
             60                  65                  70

AAG GGG AGC ATC TAT CTG GGT CAA AGC CTC AGC TTC AAG CTG CCG GTC    2874
Lys Gly Ser Ile Tyr Leu Gly Gln Ser Leu Ser Phe Lys Leu Pro Val
         75                  80                  85

TTT GTC GGG GAC GAG GTG ACG GCC GAG GTG GAG GTG ACC GCC CTT CGC    2922
Phe Val Gly Asp Glu Val Thr Ala Glu Val Glu Val Thr Ala Leu Arg
     90                  95                 100

GAG GAC AAG CCC ATC GCC ACC CTG ACC ACC CGC ATC TTC ACC CAA GGC    2970
Glu Asp Lys Pro Ile Ala Thr Leu Thr Thr Arg Ile Phe Thr Gln Gly
105                 110                 115                 120

GGC GCC CTC GCC GTG ACG GGG GAA GCC GTG GTC AAG CTG CCT            3012
Gly Ala Leu Ala Val Thr Gly Glu Ala Val Val Lys Leu Pro
                125                 130

TAAGCACCGG CGGCACGCAG GCACAATCAG CCCGGCCCCT GCCGGGCTGA TTGTTCTCCC  3072
CCGCTCCGCT TGCCCCCTTT TTCGGGGCAA TTTGGCCCAG GCCCTTTCCC TGCCCCGCCT  3132
AACTGCCTAA AATGGCCGCC CTGCCGTGTA GGCATTCATC CAGCTAGAGG AATTC       3187
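The CDS feature locations given for SEQ ID NO: 9 and SEQ ID NO: 10 can be cross-checked against the residue numbering in the listings, because an inclusive range of N nucleotides encodes N / 3 codons. The helper below is a hypothetical sketch (the name `cds_size` is not from the listing) that does this arithmetic and tolerates the stray spaces that occur in some location strings.

```python
def cds_size(location):
    """Return (nucleotides, codons) for an inclusive 'start..end' feature location."""
    start, end = (int(part) for part in location.replace(" ", "").split(".."))
    nucleotides = end - start + 1
    return nucleotides, nucleotides // 3


# Locations as given in the FEATURE entries of SEQ ID NO: 9 and SEQ ID NO: 10.
for location in ("384..734", "830..2611", "2611..3012"):
    print(location, cds_size(location))
```

Under this arithmetic the three ranges correspond to 117, 594 and 134 codons, which agrees with the last numbered residues of the upstream reading frame, the polyester synthase reading frame, and the downstream reading frame respectively.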



(2) INFORMATION FOR SEQ ID NO: 11: 
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(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "synthetic DNA"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

AGTTCCCGCC TCGGGTGTGG GTGAA 



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "synthetic DNA" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

GGCATATGCG CTCATGCGGC GTCCT 25

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "synthetic DNA"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

GCCATATGAG CGCACAATCC CTGGAAGTAG 



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "synthetic DNA" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

CTGGGATCCG CCGGTGCTTA AGGCAGCTTG 
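One way to see how the SEQ ID NO: 14 oligonucleotide relates to the genomic sequence is to take its reverse complement and compare it with the region around the stop codon of the downstream reading frame in SEQ ID NO: 9. The sketch below does that; `reverse_complement` is a hypothetical helper, and the remark that GGATCC is the BamHI recognition sequence is an observation added here, not a statement from the listing.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return "".join(seq.split()).upper().translate(COMPLEMENT)[::-1]


seq14 = "CTGGGATCCG CCGGTGCTTA AGGCAGCTTG"

# Stretch of SEQ ID NO: 9 spanning the TAA stop codon of the downstream reading frame.
template = "".join("TCAAGCTGCC TTAAGCACCG GCGGCACGCA GGCACAATCA".split())

rc = reverse_complement(seq14)
print(rc)
print(rc[:23] in template)  # does the 3' portion of the primer match the template?
print("GGATCC" in seq14)    # hexamer near the 5' end of the primer
```

The first 23 bases of the reverse complement coincide with the template, while the remaining bases differ from it and carry the GGATCC hexamer, which is consistent with a primer that adds a restriction site just downstream of the reading frame.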

(2) INFORMATION FOR SEQ ID NO: 15: 
(i) SEQUENCE CHARACTERISTICS: 
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(A) LENGTH : 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Ser Ala Gln Ser Leu Glu Val Gly Gln Lys Ala Arg Leu Ser Lys Arg
  1               5                  10                  15

Phe Gly Ala Ala
             20

(2) INFORMATION FOR SEQ ID NO: 16: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Met Ser Ala Gln Ser Leu Glu Val Gly Gln Lys Ala Arg Leu Ser Lys
  1               5                  10                  15

Arg Phe Gly Ala Ala
             20



Claims 



1. A polyester synthase gene coding for a polypeptide containing the amino acid sequence of SEQ ID NO:2 or a
sequence where in said amino acid sequence, one or more amino acids are deleted, replaced or added, said
polypeptide bringing about polyester synthase activity.

2. A polyester synthase gene comprising the nucleotide sequence of SEQ ID NO:1. 

3. A gene expression cassette comprising the polyester synthase gene of claim 1 or 2 and either of the open reading
frames located upstream and downstream of said gene.

4. The gene expression cassette according to claim 3, wherein the open reading frame located upstream of the
polyester synthase gene comprises DNA coding for the amino acid sequence of SEQ ID NO:4.



5. The gene expression cassette according to claim 3, wherein the open reading frame located upstream of the
polyester synthase gene comprises the nucleotide sequence of SEQ ID NO:3.



6. The gene expression cassette according to claim 3, wherein the open reading frame located downstream of the
polyester synthase gene comprises DNA coding for a polypeptide containing the amino acid sequence of SEQ ID
NO:6 or a sequence where in said amino acid sequence, one or more amino acids are deleted, replaced or added,
said polypeptide bringing about enoyl-CoA hydratase activity.

7. The gene expression cassette according to claim 3, wherein the open reading frame located downstream of the 
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polyester synthase gene comprises the nucleotide sequence of SEQ ID NO:5. 

8. A recombinant vector comprising the polyester synthase gene of claim 1 or 2 or the gene expression cassette of 
any one of claims 3 to 7. 

9. A transformant transformed with the recombinant vector of claim 8. 

10. A process for producing polyester, wherein the transformant of claim 9 is cultured in a medium and polyester is 
recovered from the resulting culture. 

11. The process for producing polyester according to claim 10, wherein the polyester is a copolymer of
3-hydroxyalkanoic acid represented by formula I:

     R
     |
HO - CH - CH2 - COOH

wherein R represents a hydrogen atom or a C1 to C4 alkyl group. 

12. The process for producing polyester according to claim 10, wherein the polyester is a
poly(3-hydroxybutyrate-co-3-hydroxyhexanoate) random copolymer.
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FIG. 1A 
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FIG.2 

Lanes: M, 1, 2

Molecular-weight marker positions: 94 kDa, 67 kDa, 43 kDa, 30 kDa, 21.1 kDa, 14.4 kDa

Lane M: molecular-weight marker

Lane 1: soluble-protein fraction from NB3

Lane 2: active fraction eluted from the anion exchange column
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(54) Polyester synthase gene and process for producing polyester 

(57) The present invention relates to a polyester 
synthase gene coding for a polypeptide containing the 
amino acid sequence of SEQ ID NO:2 or a sequence 
where in said amino acid sequence, one or more amino 
acids are deleted, replaced or added, said polypeptide 
bringing about polyester synthase activity; a gene 
expression cassette comprising the polyester synthase 
gene and either of open reading frames located 
upstream and downstream of said gene; a recombinant 
vector comprising the gene expression cassette; a 
transformant transformed with the recombinant vector; 
and a process for producing polyester by culturing the 
transformant in a medium and recovering polyester from 
the resulting culture. 
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